TITLE: Method for stable gene-amplification in a bacterial host cell

FIELD OF INVENTION 

In the biotech industry it is desirable to construct polypeptide production strains
having several copies of a gene of interest stably chromosomally integrated, without leaving
antibiotic resistance marker genes in the strains.

This invention relates to bacterial host cells comprising at least two copies of an
amplification unit in its genome, said amplification unit comprising: i) at least one copy of a
gene of interest, and ii) an expressible conditionally essential gene, wherein the conditionally
essential gene is either promoterless or transcribed from a heterologous promoter having an
activity substantially lower than the endogenous promoter of said conditionally essential
gene, and wherein the conditionally essential gene if not functional would render the cell
auxotrophic for at least one specific substance or unable to utilize one or more specific sole
carbon source; methods for producing a protein using the cell of the invention; and methods
for constructing the cell of the invention.

BACKGROUND OF THE INVENTION 

In the industrial production of polypeptides it is of interest to achieve a product yield
as high as possible. One way to increase the yield is to increase the copy number of a gene
encoding a polypeptide of interest. This can be done by placing the gene on a high copy
number plasmid; however, plasmids are unstable and are often lost from the host cells if there
is no selective pressure during the cultivation of the host cells. Another way to increase the
copy number of the gene of interest is to integrate it into the host cell chromosome in multiple
copies.

The present day public debate concerning the industrial use of recombinant DNA
technology has raised some questions and concerns about the use of antibiotic resistance
marker genes. Antibiotic marker genes are traditionally used as a means to select for strains
carrying multiple copies of both the marker genes and an accompanying expression cassette
coding for a polypeptide of industrial interest. In order to comply with the current demand for
recombinant production host strains devoid of antibiotic markers, we have looked for possible
alternatives to the present technology that will allow substitution of the antibiotic markers we
use today with non-antibiotic marker genes.

WO 02/00907 (Novozymes, Denmark) discloses a method for stable chromosomal
multi-copy integration of genes into a production host cell in specific well-defined sites. It is
disclosed to first render a recipient cell deficient by inactivating one or more conditionally
essential genes, e.g., to make the cell auxotrophic for an amino acid. A gene of interest may
then be integrated into the chromosome along with a DNA sequence which complements the
deficiency of the cell, thus making the resulting cell selectable; the Bacillus licheniformis
metC gene is disclosed as a conditionally essential marker therein.

WO 01/90393 (Novozymes, Denmark) discloses a method for increasing the gene
copy number in a host cell by gene-amplification, without leaving antibiotic resistance
markers behind in the host cell. The disclosed method relies on rendering a specific type of
conditionally essential chromosomal gene of the host cell non-functional, thereby rendering
the cell susceptible to an inhibitory compound produced endogenously by the cell when
cultivated in the presence of a precursor. A single amplification unit comprising the gene of
interest, and a DNA sequence, which when integrated into the chromosome complements
the non-functional conditionally essential chromosomal gene, is integrated into the
chromosome. This method relies on the endogenous production of an inhibitory compound
by the host cell in order to achieve gene amplification in a similar manner as when
classical antibiotic selection markers are used for gene amplification; the presence of the
inhibitory compound gives a survival advantage to those cells carrying duplications of the
amplification unit.

In order to provide recombinant production strains devoid of antibiotic resistance
markers, it remains of industrial interest to find new methods to stably integrate genes in
multiple copies into host cell chromosomes. Even incremental improvements of existing
methods or mere alternatives are of considerable interest to the industry.

SUMMARY OF THE INVENTION 

The problem to be solved by the present invention is to provide alternative host cells 
comprising multiple copies of a gene of interest, which cells are devoid of antibiotic markers, 
for use in the industrial production of polypeptides in high yields. 
The solution is based on the observation that an amplification unit can be integrated
into the chromosome of a host cell, and subsequently be amplified, without the use of
classical antibiotic markers, antibiotics, or endogenously produced inhibitory compounds. 

In traditional amplification protocols, higher gene expression is a result of
duplications of the antibiotic resistance marker gene, duplications which are selected in
stepwise cultivation and selection rounds by adding increasing amounts of the antibiotic
compound to the cultivation medium in each cultivation step.

A cell which has become auxotrophic, e.g., due to a non-functional conditionally
essential gene, would normally be complemented back to the prototrophic phenotype by the
integration (or restoration) in the chromosome of even one single functional copy of the
non-functional gene. Since normally only one copy is needed, such genes have not previously
been attractive candidates for amplification purposes.

However, the present inventors lowered the expression-level of a non-antibiotic
conditionally essential gene by decreasing the promoter activity, so that more than one
functional copy of the gene would be advantageous to a deficient host cell. The integration of
an amplification unit comprising such a low-level expression conditionally essential gene, into
a host cell deficient for the same gene, reproducibly resulted in genomic duplications of the
integrated amplification unit, comparable to what has been observed when using traditional
amplifiable antibiotic markers.

In fact, this invention provides the means for controlling the level of gene
expression, i.e., copy-number, in a host cell. By choosing carefully the strength of the
heterologous promoter expressing the conditionally essential marker gene in the
amplification unit, the optimal copy-number of the amplification unit can be adjusted up or
down, depending on the desired expression level of the gene of interest also comprised in
the unit.
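
By way of a non-limiting illustration, the relationship between promoter strength and
copy-number can be sketched with a deliberately simplified model that is not part of the
disclosure. It assumes that expression of the conditionally essential gene adds linearly per
copy, that the endogenous promoter has a relative strength of 1.0, and that the cell needs a
fixed total expression to grow without supplementation. The function name, the linear model
and the numbers are hypothetical.

```python
import math


def min_copies_for_growth(relative_promoter_strength, required_expression=1.0):
    """Smallest copy number whose summed expression meets the requirement.

    Toy assumption: each copy contributes `relative_promoter_strength`
    units of expression, where the endogenous promoter is defined as 1.0.
    A promoterless gene expressed by read-through would need its effective
    strength estimated separately; this sketch only accepts positive values.
    """
    if relative_promoter_strength <= 0:
        raise ValueError("promoter strength must be positive in this toy model")
    return math.ceil(required_expression / relative_promoter_strength)


# Weaker promoters call for more copies of the amplification unit.
for strength in (1.0, 0.5, 0.25, 0.125):
    print(strength, min_copies_for_growth(strength))
```

Under these assumptions a weaker heterologous promoter shifts the selected copy-number
upwards, which is the qualitative behaviour described above; real expression levels need
not scale linearly with copy-number.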

Accordingly, in a first aspect the invention relates to a bacterial host cell comprising 
at least two copies of an amplification unit in its genome, said amplification unit comprising: 
i) at least one copy of a gene of interest, and 
ii) an expressible conditionally essential gene, wherein the conditionally essential gene
is either promoterless or transcribed from a heterologous promoter having an activity
substantially lower than the endogenous promoter of said conditionally essential
gene, and

wherein the conditionally essential gene if not functional would render the cell auxotrophic for
at least one specific substance or unable to utilize one or more specific sole carbon source.
In a second aspect, the invention relates to a method for producing a protein 
encoded by a gene of interest, comprising 

a) culturing a bacterial host cell comprising at least two duplicated copies of an
amplification unit in its genome, the amplification unit comprising:

i) at least one copy of the gene of interest, and

ii) an expressible conditionally essential gene, wherein the conditionally essential
gene is either promoterless or transcribed from a heterologous promoter having
an activity substantially lower than the endogenous promoter of said
conditionally essential gene,
wherein the conditionally essential gene if not functional would render the cell auxotrophic
for at least one specific substance or unable to utilize one or more specific sole carbon
source; and

b) recovering the protein. 

In a final aspect, the invention also relates to a method for producing a bacterial cell
comprising two or more amplified chromosomal copies of a gene of interest, the method
comprising:

a) providing a bacterial cell comprising at least one copy of an amplification unit, the unit
comprising:

i) at least one copy of the gene of interest, and

ii) an expressible functional copy of a conditionally essential gene, which is either
promoterless or transcribed from a heterologous promoter having an activity
substantially lower than the endogenous promoter of said conditionally essential
gene,

wherein the conditionally essential gene if not functional would render the cell auxotrophic
for at least one specific substance or unable to utilize one or more specific sole carbon
source;

b) cultivating the cell under conditions suitable for growth in a medium deficient of said at
least one specific substance and/or with said one or more specific sole carbon source,
thereby providing a growth advantage to a cell in which the amplification unit has been
duplicated in the chromosome; and
c) selecting a cell wherein the amplification unit has been duplicated in the chromosome,
whereby two or more amplified chromosomal copies of the gene of interest were
produced.
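
The growth advantage relied upon in the cultivating and selecting steps can be pictured with
a minimal sketch. It is an illustration only and assumes a hypothetical two-population model
in which cells carrying a duplication outgrow single-copy cells by a constant factor per
generation; the function name, the starting fraction and the factor are invented for the
example and are not taken from the disclosure.

```python
def duplicated_fraction(generations, initial_fraction=0.001, advantage=1.5):
    """Fraction of the culture carrying a duplicated amplification unit.

    Toy assumption: per generation, duplicated cells multiply `advantage`
    times as fast as single-copy cells in the selective medium.
    """
    duplicated = initial_fraction
    single = 1.0 - initial_fraction
    for _ in range(generations):
        duplicated *= advantage          # relative growth of duplicated cells
        total = duplicated + single      # renormalise to a fraction of culture
        duplicated, single = duplicated / total, single / total
    return duplicated


# Rare duplications come to dominate the culture if the advantage persists.
for n in (0, 10, 20, 30):
    print(n, round(duplicated_fraction(n), 4))
```

In this simplified picture even a rare duplication event is enriched over successive
generations, so that a cell with two or more copies can be selected from the culture.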

It is envisioned that all the preferred embodiments of the cell of the invention that
are shown herein would be suitable for use in the methods of the second and third aspects of
the invention.

DEFINITIONS

In accordance with the present invention there may be employed conventional
molecular biology, microbiology, and recombinant DNA techniques within the skill of the art.
Such techniques are explained fully in the literature. See, e.g., Sambrook, Fritsch & Maniatis,
Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York (herein "Sambrook et al., 1989"); DNA
Cloning: A Practical Approach, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide
Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds.
(1985)); Transcription And Translation (B.D. Hames & S.J. Higgins, eds. (1984)); Animal Cell
Culture (R.I. Freshney, ed. (1986)); Immobilized Cells And Enzymes (IRL Press, (1986)); B.
Perbal, A Practical Guide To Molecular Cloning (1984).

A "polynucleotide" is a single- or double-stranded polymer of deoxyribonucleotide or 
ribonucleotide bases read from the 5' to the 3' end. Polynucleotides include RNA and DNA, 
and may be isolated from natural sources, synthesized in vitro, or prepared from a 
combination of natural and synthetic molecules.
A "nucleic acid molecule" or "nucleotide sequence" refers to the phosphate ester
polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA
molecules") or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or
deoxycytidine; "DNA molecules") in either single stranded form, or a double-stranded helix.
Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic
acid molecule, and in particular DNA or RNA molecule, refers only to the primary and
secondary structure of the molecule, and does not limit it to any particular tertiary or
quaternary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear or
circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In
discussing the structure of particular double-stranded DNA molecules, sequences may be
described herein according to the normal convention of giving only the sequence in the 5' to
3' direction along the nontranscribed strand of DNA (i.e., the strand having a sequence
homologous to the mRNA). A "recombinant DNA molecule" is a DNA molecule that has
undergone a molecular biological manipulation.
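
The strand convention described above may be illustrated by the following minimal sketch,
which is offered as an example only. The function names are hypothetical, and the sketch
assumes an unambiguous DNA sequence made up solely of the letters A, C, G and T.

```python
# Watson-Crick complement for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def template_strand(nontranscribed_5to3):
    """Reverse complement: the transcribed strand, also written 5' to 3'."""
    return nontranscribed_5to3.upper().translate(COMPLEMENT)[::-1]


def mrna_from_nontranscribed(nontranscribed_5to3):
    """The mRNA matches the nontranscribed strand, with U in place of T."""
    return nontranscribed_5to3.upper().replace("T", "U")


sequence = "ATGGCC"                        # written 5' to 3', nontranscribed strand
print(template_strand(sequence))           # transcribed strand, 5' to 3'
print(mrna_from_nontranscribed(sequence))  # corresponding mRNA, 5' to 3'
```

Giving only the nontranscribed strand is therefore sufficient, since both the transcribed
strand and the mRNA can be derived from it.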

A nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a
cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can
anneal to the other nucleic acid molecule under the appropriate conditions of temperature
and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and
ionic strength determine the "stringency" of the hybridization.

A DNA "coding sequence" or an "open reading frame (ORF)" is a double-stranded
DNA sequence which is transcribed and translated into a polypeptide in a cell in vitro or in
vivo when placed under the control of appropriate regulatory sequences. The boundaries of
the coding sequence are determined by a start codon at the 5' (amino) terminus and a
translation stop codon at the 3' (carboxyl) terminus. A coding sequence can include, but is
not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA
sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA sequences. If
the coding sequence is intended for expression in a eukaryotic cell, a polyadenylation signal
and transcription termination sequence will usually be located 3' to the coding sequence.
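
As a non-limiting illustration of how the boundaries of a coding sequence are determined,
the following sketch scans the nontranscribed strand for a start codon and reads in frame
to the first stop codon. It is a simplification: the function name is hypothetical, only ATG
is treated as a start codon (bacteria may also use alternative start codons), and only the
given strand is searched.

```python
# Stop codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(nontranscribed_5to3):
    """Return the first ATG-to-stop open reading frame, or None if absent."""
    seq = nontranscribed_5to3.upper()
    start = seq.find("ATG")
    while start != -1:
        # Step through complete codons in the frame set by this start codon.
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]   # includes start and stop codons
        start = seq.find("ATG", start + 1)
    return None


print(first_orf("ccATGGCTAAATAGgg"))
```

The returned string runs from the start codon at the 5' terminus to the stop codon at the
3' terminus, matching the definition given above.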
An expression vector is a DNA molecule, linear or circular, that comprises a
segment encoding a polypeptide of interest operably linked to additional segments that
provide for its transcription. Such additional segments may include promoter and terminator
sequences, and optionally one or more origins of replication, one or more selectable
markers, an enhancer, a polyadenylation signal, and the like. Expression vectors are
generally derived from plasmid or viral DNA, or may contain elements of both.

Transcriptional and translational control sequences are DNA regulatory sequences,
such as promoters, enhancers, terminators, and the like, that provide for the expression of a
coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are control
sequences.

A "secretory signal sequence" is a DNA sequence that encodes a polypeptide (a
"secretory peptide") that, as a component of a larger polypeptide, directs the larger
polypeptide through a secretory pathway of a cell in which it is synthesized. The larger
polypeptide is commonly cleaved to remove the secretory peptide during transit through the
secretory pathway.

The term "promoter" is used herein for its art-recognized meaning to denote a
portion of a gene containing DNA sequences that provide for the binding of RNA polymerase
and initiation of transcription. Promoter sequences are commonly, but not always, found in
the 5' non-coding regions of genes.

A chromosomal gene is rendered non-functional if the polypeptide that the gene
encodes can no longer be expressed in a functional form. Such non-functionality of a gene
can be induced by a wide variety of genetic manipulations as known in the art, some of which
are described in Sambrook et al., vide supra. Partial deletions within the ORF of a gene will
often render the gene non-functional, as will mutations.

The term "an expressible copy of a chromosomal gene" is used herein as meaning a
copy of the ORF of a chromosomal gene, wherein the ORF can be expressed to produce a
fully functional gene product. The expressible copy may not be transcribed from the native
promoter of the chromosomal gene; it may instead be transcribed from a foreign or
heterologous promoter, or it may indeed be promoterless and expressed only by
transcriptional read-through from a gene present upstream of the 5' end of the ORF.
Transcriptional read-through is intended to have the same meaning here as the generally
recognized meaning in the art.

"Operably linked", when referring to DNA segments, indicates that the segments are
arranged so that they function in concert for their intended purposes, e.g. transcription
initiates in the promoter and proceeds through the coding segment to the terminator.

A coding sequence is "under the control" of transcriptional and translational control
sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA,
which is then trans-RNA spliced and translated into the protein encoded by the coding
sequence.

"Heterologous" DNA refers to DNA not naturally located in the cell or in a
chromosomal site of the cell. Preferably, the heterologous DNA includes a gene foreign to
the cell.

As used herein the term "nucleic acid construct" is intended to indicate any nucleic
acid molecule of cDNA, genomic DNA, synthetic DNA or RNA origin. The term "construct"
is intended to indicate a nucleic acid segment which may be single- or double-stranded,
and which may be based on a complete or partial naturally occurring nucleotide sequence
encoding a polypeptide of interest. The construct may optionally contain other nucleic acid
segments.

The nucleic acid construct of the invention encoding the polypeptide of the invention
may suitably be of genomic or cDNA origin, for instance obtained by preparing a genomic or
cDNA library and screening for DNA sequences coding for all or part of the polypeptide by
hybridization using synthetic oligonucleotide probes in accordance with standard techniques
(cf. Sambrook et al., supra).

The nucleic acid construct of the invention encoding the polypeptide may also be
prepared synthetically by established standard methods, e.g. the phosphoamidite method
described by Beaucage and Caruthers, Tetrahedron Letters 22 (1981), 1859 - 1869, or the
method described by Matthes et al., EMBO Journal 3 (1984), 801 - 805. According to the
phosphoamidite method, oligonucleotides are synthesized, e.g. in an automatic DNA
synthesizer, purified, annealed, ligated and cloned in suitable vectors.

Furthermore, the nucleic acid construct may be of mixed synthetic and genomic,
mixed synthetic and cDNA or mixed genomic and cDNA origin prepared by ligating
fragments of synthetic, genomic or cDNA origin (as appropriate), the fragments
corresponding to various parts of the entire nucleic acid construct, in accordance with
standard techniques. The nucleic acid construct may also be prepared by polymerase chain
reaction using specific primers, for instance as described in US 4,683,202 or Saiki et al.,
Science 239 (1988), 487 - 491.

The term nucleic acid construct may be synonymous with the term "expression
cassette" when the nucleic acid construct contains the control sequences necessary for
expression of a coding sequence of the present invention.

The term "control sequences" is defined herein to include all components which are
necessary or advantageous for expression of the coding sequence of the nucleic acid
sequence. Each control sequence may be native or foreign to the nucleic acid sequence
encoding the polypeptide. Such control sequences include, but are not limited to, a leader,
a polyadenylation sequence, a propeptide sequence, a promoter, a signal sequence, and a
transcription terminator. At a minimum, the control sequences include a promoter, and
transcriptional and translational stop signals. The control sequences may be provided with
linkers for the purpose of introducing specific restriction sites facilitating ligation of the control
sequences with the coding region of the nucleic acid sequence encoding a polypeptide.

The control sequence may be an appropriate promoter sequence, a nucleic acid
sequence which is recognized by a host cell for expression of the nucleic acid sequence.
The promoter sequence contains transcription and translation control sequences which
mediate the expression of the polypeptide. The promoter may be any nucleic acid
sequence which shows transcriptional activity in the host cell of choice and may be obtained
from genes encoding extracellular or intracellular polypeptides either homologous or
heterologous to the host cell.

The control sequence may also be a suitable transcription terminator sequence, a
sequence recognized by a host cell to terminate transcription. The terminator sequence is
operably linked to the 3' terminus of the nucleic acid sequence encoding the polypeptide.
Any terminator which is functional in the host cell of choice may be used in the present
invention.

The control sequence may also be a polyadenylation sequence, a sequence which
is operably linked to the 3' terminus of the nucleic acid sequence and which, when
transcribed, is recognized by the host cell as a signal to add polyadenosine residues to
transcribed mRNA. Any polyadenylation sequence which is functional in the host cell of
choice may be used in the present invention.

The control sequence may also be a signal peptide coding region, which codes for
an amino acid sequence linked to the amino terminus of the polypeptide which can direct the
expressed polypeptide into the secretory pathway of the host cell. The 5' end of the
coding sequence of the nucleic acid sequence may inherently contain a signal peptide
coding region naturally linked in translation reading frame with the segment of the coding
region which encodes the secreted polypeptide. Alternatively, the 5' end of the coding
sequence may contain a signal peptide coding region which is foreign to that portion of the
coding sequence which encodes the secreted polypeptide. A foreign signal peptide coding
region may be required where the coding sequence does not normally contain a signal
peptide coding region. Alternatively, the foreign signal peptide coding region may simply
replace the natural signal peptide coding region in order to obtain enhanced secretion of the
polypeptide relative to the natural signal peptide coding region normally associated with the
coding sequence. The signal peptide coding region may be obtained from a glucoamylase
or an amylase gene from an Aspergillus species, a lipase or proteinase gene from a
Rhizomucor species, the gene for the alpha-factor from Saccharomyces cerevisiae, an
amylase or a protease gene from a Bacillus species, or the calf preprochymosin gene.
However, any signal peptide coding region capable of directing the expressed polypeptide
into the secretory pathway of a host cell of choice may be used in the present invention.

The control sequence may also be a propeptide coding region, which codes for an
amino acid sequence positioned at the amino terminus of a polypeptide. The resultant
polypeptide is known as a proenzyme or propolypeptide (or a zymogen in some cases). A
propolypeptide is generally inactive and can be converted to mature active polypeptide by
catalytic or autocatalytic cleavage of the propeptide from the propolypeptide. The
propeptide coding region may be obtained from the Bacillus subtilis alkaline protease gene
(aprE), the Bacillus subtilis neutral protease gene (nprT), the Saccharomyces cerevisiae
alpha-factor gene, or the Myceliophthora thermophila laccase gene (WO 95/33836).

It may also be desirable to add regulatory sequences which allow the regulation of
the expression of the polypeptide relative to the growth of the host cell. Examples of
regulatory systems are those which cause the expression of the gene to be turned on or off
in response to a chemical or physical stimulus, including the presence of a regulatory
compound. Regulatory systems in prokaryotic systems would include the lac, tac, and trp
operator systems. In yeast, the ADH2 system or GAL1 system may be used. In
filamentous fungi, the TAKA alpha-amylase promoter, Aspergillus niger glucoamylase
promoter, and the Aspergillus oryzae glucoamylase promoter may be used as regulatory
sequences. Other examples of regulatory sequences are those which allow for gene
amplification. In eukaryotic systems, these include the dihydrofolate reductase gene which
is amplified in the presence of methotrexate, and the metallothionein genes which are
amplified with heavy metals. In these cases, the nucleic acid sequence encoding the
polypeptide would be placed in tandem with the regulatory sequence.
Examples of suitable promoters for directing the transcription of the conditionally
essential gene(s) of the present invention, especially in a bacterial host cell, are the
promoters obtained from the E. coli lac operon, the Streptomyces coelicolor agarase gene
(dagA), the Bacillus subtilis levansucrase gene (sacB), the Bacillus subtilis alkaline protease
gene, the Bacillus licheniformis alpha-amylase gene (amyL), the Bacillus stearothermophilus
maltogenic amylase gene (amyM), the Bacillus amyloliquefaciens alpha-amylase gene
(amyQ), the Bacillus amyloliquefaciens BAN amylase gene, the Bacillus licheniformis
penicillinase gene (penP), the Bacillus subtilis xylA and xylB genes, and the prokaryotic
beta-lactamase gene (Villa-Komaroff et al., 1978, Proceedings of the National Academy of
Sciences USA 75:3727-3731), as well as the tac promoter (DeBoer et al., 1983, Proceedings
of the National Academy of Sciences USA 80:21-25). Further promoters are described in
"Useful proteins from recombinant bacteria" in Scientific American, 1980, 242:74-94; and in
Sambrook et al., 1989, supra.

The term "auxotrophic" in the present context means that the auxotrophic cell
requires at least one specific substance for growth and metabolism that the parental
organism was able to synthesize on its own. The term is used with respect to organisms,
such as strains of bacteria, that can no longer synthesize the substance(s) because of
mutational changes.

An effective signal peptide coding region for bacterial host cells is the signal peptide
coding region obtained from the maltogenic amylase gene from Bacillus NCIB 11837, the
Bacillus stearothermophilus alpha-amylase gene, the Bacillus licheniformis subtilisin gene,
the Bacillus licheniformis beta-lactamase gene, the Bacillus stearothermophilus neutral
proteases genes (nprT, nprS, nprM), and the Bacillus subtilis PrsA gene. Further signal
peptides are described by Simonen and Palva, 1993, Microbiological Reviews 57:109-137.
The present invention also relates to recombinant expression vectors comprising a
nucleic acid sequence of the present invention, a promoter, and transcriptional and
translational stop signals. The various nucleic acid and control sequences described above
may be joined together to produce a recombinant expression vector which may include one
or more convenient restriction sites to allow for insertion or substitution of the nucleic acid
sequence encoding the polypeptide at such sites. Alternatively, the nucleic acid sequence
of the present invention may be expressed by inserting the nucleic acid sequence or a
nucleic acid construct comprising the sequence into an appropriate vector for expression. In
creating the expression vector, the coding sequence is located in the vector so that the
coding sequence is operably linked with the appropriate control sequences for expression,
and possibly secretion.

The recombinant expression vector may be any vector (e.g., a plasmid or virus)
which can be conveniently subjected to recombinant DNA procedures and can bring about
the expression of the nucleic acid sequence. The choice of the vector will typically depend
on the compatibility of the vector with the host cell into which the vector is to be introduced.
The vectors may be linear or closed circular plasmids. The vector may be an autonomously
replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of
which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal
element, a minichromosome, or an artificial chromosome. The vector may contain any
means for assuring self-replication. Alternatively, the vector may be one which, when
introduced into the host cell, is integrated into the genome and replicated together with the
chromosome(s) into which it has been integrated. The vector system may be a single vector
or plasmid or two or more vectors or plasmids which together contain the total DNA to be
introduced into the genome of the host cell, or a transposon.

The vectors of the present invention preferably contain one or more selectable
markers which permit easy selection of transformed cells. A selectable marker is a gene the
product of which provides for biocide or viral resistance, resistance to heavy metals,
prototrophy to auxotrophs, and the like.

Antibiotic selectable markers confer antibiotic resistance to such antibiotics as
ampicillin, kanamycin, chloramphenicol, tetracycline, neomycin, hygromycin or methotrexate.
Suitable markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and
URA3.

The vectors of the present invention preferably contain an element(s) that permits
stable integration of the vector, or of a smaller part of the vector, into the host cell genome or
autonomous replication of the vector in the cell independent of the genome of the cell.

The vectors, or smaller parts of the vectors such as amplification units of the present
invention, may be integrated into the host cell genome when introduced into a host cell. For
chromosomal integration, the vector may rely on the nucleic acid sequence encoding the
polypeptide or any other element of the vector for stable integration of the vector into the
genome by homologous or nonhomologous recombination.

Alternatively, the vector may contain additional nucleic acid sequences for directing
integration by homologous recombination into the genome of the host cell. The additional
nucleic acid sequences enable the vector to be integrated into the host cell genome at a
precise location(s) in the chromosome(s). To increase the likelihood of integration at a
precise location, the integrational elements should preferably contain a sufficient number of
nucleic acids, such as 100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, and most
preferably 800 to 1,500 base pairs, which are highly homologous with the corresponding
target sequence to enhance the probability of homologous recombination. The integrational
elements may be any sequence that is homologous with the target sequence in the genome
of the host cell. Furthermore, the integrational elements may be non-encoding or encoding
nucleic acid sequences; specific examples of encoding sequences suitable for site-specific
integration by homologous recombination are given in WO 02/00907 (Novozymes,
Denmark), which is hereby incorporated by reference in its totality.
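
The stated length ranges for the integrational elements can be summarised in a short
sketch, given purely as an illustration. The function name and the tier labels are
hypothetical; only the numeric ranges are taken from the paragraph above, and the sketch
does not assess the degree of homology itself.

```python
def homology_length_tier(length_bp):
    """Classify an integrational element by the length ranges stated above."""
    if 800 <= length_bp <= 1500:
        return "most preferred"        # 800 to 1,500 base pairs
    if 400 <= length_bp <= 1500:
        return "preferred"             # 400 to 1,500 base pairs
    if 100 <= length_bp <= 1500:
        return "suitable"              # 100 to 1,500 base pairs
    return "outside stated ranges"


for length in (50, 150, 500, 900, 2000):
    print(length, homology_length_tier(length))
```

Because the ranges are nested, the checks run from the narrowest range to the widest so
that each length is assigned to the most specific tier it satisfies.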

On the other hand, the vector may be integrated into the genome of the host cell by
non-homologous recombination. These nucleic acid sequences may be any sequence that
is homologous with a target sequence in the genome of the host cell, and, furthermore, may
be non-encoding or encoding sequences.

The copy number of a vector, an expression
cassette, an amplification unit, a gene or indeed any defined nucleotide sequence is the
number of identical copies that are present in a host cell at any time. A gene or another
defined chromosomal nucleotide sequence may be present in one, two, or more copies on
the chromosome. An autonomously replicating vector may be present in one, or several
hundred copies per host cell.

An amplification unit of the invention is a nucleotide sequence that can integrate into
the chromosome of a host cell, whereupon it can increase in number of chromosomally
integrated copies by duplication or multiplication. The unit comprises an expression cassette
as defined herein comprising at least one copy of a gene of interest and an expressible copy
of a chromosomal gene, as defined herein, of the host cell. When the amplification unit is
integrated into the chromosome of a host cell, it is defined as that particular region of the
chromosome which is prone to being duplicated by homologous recombination between two
directly repeated regions of DNA. The precise border of the amplification unit with respect to
the flanking DNA is thus defined functionally, since the duplication process may indeed
duplicate parts of the DNA which was introduced into the chromosome as well as parts of the
endogenous chromosome itself, depending on the exact site of recombination within the
repeated regions. This principle is illustrated in Janniere et al. (1985, Stable gene
amplification in the chromosome of Bacillus subtilis. Gene, 40: 47-55), which is incorporated
herein by reference.

For autonomous replication, the vector may further comprise an origin of replication
enabling the vector to replicate autonomously in the host cell in question. Examples of
bacterial origins of replication are the origins of replication of plasmids pBR322, pUC19,
pACYC177, pACYC184, pUB110, pE194, pTA1060, and pAMbeta1. Examples of origins of
replication for use in a yeast host cell are the 2 micron origin of replication, the combination
of CEN6 and ARS4, and the combination of CEN3 and ARS1. The origin of replication may
be one having a mutation which makes its functioning temperature-sensitive in the host cell
(see, e.g., Ehrlich, 1978, Proceedings of the National Academy of Sciences USA 75:1433).

The present invention also relates to recombinant host cells, comprising a nucleic
acid sequence of the invention, which are advantageously used in the recombinant
production of the polypeptides. The term "host cell" encompasses any progeny of a parent
cell which is not identical to the parent cell due to mutations that occur during replication.

The cell is preferably transformed with a vector comprising a nucleic acid sequence
of the invention followed by integration of the vector into the host chromosome.
"Transformation" means introducing a vector comprising a nucleic acid sequence of the
present invention into a host cell so that the vector is maintained as a chromosomal integrant
or as a self-replicating extra-chromosomal vector. Integration is generally considered to be
an advantage as the nucleic acid sequence is more likely to be stably maintained in the cell.
Integration of the vector into the host chromosome may occur by homologous or
non-homologous recombination as described above.

The transformation of a bacterial host cell may, for instance, be effected by
protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular General Genetics
168:111-115), by using competent cells (see, e.g., Young and Spizizin, 1961, Journal of
Bacteriology 81:823-829, or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular
Biology 56:209-221), by electroporation (see, e.g., Shigekawa and Dower, 1988,
Biotechniques 6:742-751), or by conjugation (see, e.g., Koehler and Thorne, 1987, Journal of
Bacteriology 169:5771-5278).

The transformed or transfected host cells described above are cultured in a suitable
nutrient medium under conditions permitting the expression of the desired polypeptide, after
which the resulting polypeptide is recovered from the cells, or the culture broth.

The medium used to culture the cells may be any conventional medium suitable for
growing the host cells, such as minimal or complex media containing appropriate
supplements. Suitable media are available from commercial suppliers or may be prepared
according to published recipes (e.g. in catalogues of the American Type Culture Collection).
The media are prepared using procedures known in the art (see, e.g., references for bacteria
and yeast; Bennett, J.W. and LaSure, L., editors, More Gene Manipulations in Fungi,
Academic Press, CA, 1991).

If the polypeptide is secreted into the nutrient medium, the polypeptide can be
recovered directly from the medium. If the polypeptide is not secreted, it is recovered from
cell lysates. The polypeptide is recovered from the culture medium by conventional
procedures including separating the host cells from the medium by centrifugation or filtration,
precipitating the proteinaceous components of the supernatant or filtrate by means of a salt,
e.g. ammonium sulphate, purification by a variety of chromatographic procedures, e.g. ion
exchange chromatography, gel filtration chromatography, affinity chromatography, or the like,
dependent on the type of polypeptide in question.

The polypeptides may be detected using methods known in the art that are specific
for the polypeptides. These detection methods may include use of specific antibodies,
formation of an enzyme product, or disappearance of an enzyme substrate. For example,
an enzyme assay may be used to determine the activity of the polypeptide.

The polypeptides of the present invention may be purified by a variety of procedures
known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity,
hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g.,
preparative isoelectric focusing (IEF)), differential solubility (e.g., ammonium sulfate
precipitation), or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden,
editors, VCH Publishers, New York, 1989).

DETAILED DESCRIPTION OF THE INVENTION 

The first aspect of the invention relates to a bacterial host cell comprising at least
two copies of an amplification unit in its genome, said amplification unit comprising:
i) at least one copy of a gene of interest, and
ii) an expressible conditionally essential gene, wherein the conditionally essential gene
is either promoterless or transcribed from a heterologous promoter having an activity
substantially lower than the endogenous promoter of said conditionally essential
gene, and
wherein the conditionally essential gene if not functional would render the cell auxotrophic for
at least one specific substance or unable to utilize one or more specific sole carbon source.

The choice of a host cell will to a large extent depend upon the gene encoding the
polypeptide and its source. The host cell may be a unicellular microorganism, e.g., a
prokaryote, or a non-unicellular microorganism, e.g., a eukaryote. Useful unicellular cells
are bacterial cells such as gram positive bacteria including, but not limited to, a Bacillus cell,
e.g., Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans,
Bacillus coagulans, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus
megaterium, Bacillus stearothermophilus, Bacillus subtilis, and Bacillus thuringiensis; or a
Streptomyces cell, e.g., Streptomyces lividans or Streptomyces murinus, or gram negative
bacteria such as E. coli and Pseudomonas sp. In a preferred embodiment, the bacterial
host cell is a Bacillus lentus, Bacillus licheniformis, Bacillus stearothermophilus or Bacillus
subtilis cell. In one preferred embodiment, the bacterial host cell is a prokaryotic cell,
preferably a Gram-positive prokaryotic cell, and more preferably the bacterial Gram
positive cell is a species of the genus Bacillus, preferably selected from the group consisting
of Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans,
Bacillus coagulans, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus
megaterium, Bacillus stearothermophilus, Bacillus subtilis, and Bacillus thuringiensis.

As described above, chromosomal integration of a vector or a smaller part of a
vector, such as an amplification unit of the invention, into the genome of the host cell can be
achieved in a number of ways. A non-limiting example of integration by homologous
recombination is shown herein.

A preferred embodiment of the invention relates to the cells of the invention, or the
methods of the invention, wherein the amplification unit further comprises a nucleotide
sequence with a homology to a chromosomal nucleotide sequence of the host cell sufficient
to effect chromosomal integration in the host cell of the amplification unit by homologous
recombination, preferably the amplification unit further comprises a nucleotide sequence of at
least 100 bp, preferably 200 bp, more preferably 300 bp, even more preferably 400 bp, and
most preferably at least 500 bp with an identity of at least 70%, preferably 80%, more
preferably 90%, even more preferably 95%, and most preferably at least 98% identity to a
chromosomal nucleotide sequence of the host cell.
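
The length and identity thresholds in this embodiment lend themselves to a short check.
The Python sketch below is a minimal illustration under stated assumptions: the two
sequences are taken to be already aligned, ungapped and of equal length, and the function
names and default thresholds (100 bp, 70%) are chosen here for illustration from the
broadest values in the paragraph above. It is not a substitute for a real alignment tool.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_homology_minimum(seq_a: str, seq_b: str,
                           min_length_bp: int = 100,
                           min_identity_pct: float = 70.0) -> bool:
    """True if the aligned region satisfies the broadest thresholds quoted in the text."""
    return (len(seq_a) >= min_length_bp
            and percent_identity(seq_a, seq_b) >= min_identity_pct)


if __name__ == "__main__":
    unit_arm = "ACGT" * 30                 # hypothetical 120 bp arm in the unit
    chromosome = "ACGA" * 30               # hypothetical target, 1 mismatch in 4
    print(percent_identity(unit_arm, chromosome))
    print(meets_homology_minimum(unit_arm, chromosome))
```

Stricter embodiments can be checked by passing, for example, min_length_bp=500 and
min_identity_pct=98.0.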

In a non-limiting example, integration into the chromosome of a host cell can be
selected for by first rendering a conditionally essential host cell gene non-functional as
described elsewhere herein, thereby rendering the host cell selectable, then targeting the
vector's integration by including on it a likewise non-functional copy of the same host gene of a
size that allows homologous recombination between the two different copies of the
non-functional host genes in the genome of the host cell and on the integration vector, tailored so
that such a recombination will restore a functional copy of the gene, thus leaving the host cell
selectable. Alternatively, the vector may simply comprise a functional copy of the conditionally
essential gene, to select for integration anywhere in the genome.

A preferred embodiment of the invention relates to the cell of the invention, wherein
a first amplification unit integrates into the host cell chromosome by homologous
recombination with the partially deleted conditionally essential gene and renders the gene
functional.

A preferred embodiment of the invention relates to the cell of the invention, wherein
the gene of interest encodes a polypeptide of interest, preferably the polypeptide is an
enzyme such as a protease; a cellulase; a lipase; a xylanase; a phospholipase; or preferably
an amylase.

Another preferred embodiment of the invention relates to the cell of the invention,
wherein the polypeptide is a hormone, a pro-hormone, a pre-pro-hormone, a small peptide, a
receptor, or a neuropeptide.

Still another preferred embodiment of the invention relates to the cell of the
invention, wherein the gene of interest encodes an enzyme, preferably an amylolytic
enzyme, a lipolytic enzyme, a proteolytic enzyme, a cellulolytic enzyme, an oxidoreductase or
a plant cell-wall degrading enzyme, and more preferably an enzyme with an activity selected
from the group consisting of aminopeptidase, amylase, amyloglucosidase, carbohydrase,
carboxypeptidase, catalase, cellulase, chitinase, cutinase, cyclodextrin glycosyltransferase,
deoxyribonuclease, esterase, galactosidase, beta-galactosidase, glucoamylase, glucose
oxidase, glucosidase, haloperoxidase, hemicellulase, invertase, isomerase, laccase, ligase,
lipase, lyase, mannosidase, oxidase, pectinase, peroxidase, phytase, phenoloxidase,
polyphenoloxidase, protease, ribonuclease, transferase, transglutaminase, or xylanase.

In a preferred embodiment, the invention relates to a cell, wherein the gene of
interest encodes an antimicrobial peptide, preferably an anti-fungal peptide or an
anti-bacterial peptide, or a peptide with biological activity in the human body, preferably a
pharmaceutically active peptide, more preferably insulin/pro-insulin/pre-pro-insulin or variants
thereof, growth hormone or variants thereof, or blood clotting factor VII or VIII or variants
thereof.

Conditionally essential genes are well-characterized in the literature, for instance
genes that are required for a cell to synthesize one or more amino acids, where a
non-functional gene encoding a polypeptide required for synthesis of an amino acid renders the
cell auxotrophic for that amino acid, and the cell can only grow if the amino acid is supplied
to the growth medium. Restoration of the functionality of such a gene, or complementation by
providing an exogenous functional copy of such a gene, allows the cell to synthesise the
amino acid on its own, and it becomes selectable against a background of auxotrophic cells.

Consequently, a preferred embodiment of the invention relates to a cell of the first
aspect, wherein the conditionally essential chromosomal gene(s) of the host cell encodes
one or more polypeptide(s) involved in amino acid synthesis, and the non-functionality of the
endogenous versions of the gene(s) renders the cell auxotrophic for one or more amino
acid(s), and wherein restoration of the functionality of the gene(s) renders the cell
prototrophic for the amino acid(s).

Bacillus subtilis metE encodes an S-adenosyl-methionine synthetase; the metE/MetE
sequences are available from EMBL:BS52812 (accession no. U52812) (Yocum, R.R.,
Perkins, J.B., Howitt, C.L., Pero, J.; 1996. Cloning and characterization of the metE gene
encoding S-adenosylmethionine synthetase from Bacillus subtilis. J. Bacteriol.
178(15):4604).

The leuB gene encodes 3-isopropylmalate dehydrogenase, which catalyses the
conversion of 3-carboxy-2-hydroxy-4-methylpentanoate to
3-carboxy-4-methyl-2-oxopentanoate. A leuB-deficient strain will be a leucine auxotroph.

The lysA gene encodes diaminopimelate decarboxylase, which catalyses the
conversion of meso-2,6-diaminoheptanedioate to L-lysine. A lysA-deficient strain will be a
lysine auxotroph.

A preferred embodiment relates to a cell of the invention, wherein the conditionally
essential gene encodes an enzyme from the biosynthetic pathway of an amino acid;
preferably the conditionally essential gene encodes one or more polypeptide(s) involved in
lysine, leucine or methionine synthesis, preferably the conditionally essential gene is
homologous to the lysA, leuB, metC, or the metE gene from Bacillus subtilis, and more
preferably the conditionally essential gene is the lysA, leuB, metC, or metE gene from
Bacillus licheniformis; more preferably the conditionally essential gene is at least 75%
identical, preferably 85% identical, more preferably 95% and most preferably at least 97%
identical to the lysA sequence of Bacillus licheniformis shown in SEQ ID NO:48 of WO
02/00907 A1, the leuB sequence of Bacillus licheniformis, the metC sequence of Bacillus
licheniformis shown in SEQ ID NO:42 of WO 02/00907 A1, or the metE sequence of Bacillus
subtilis shown in positions 997 to 2199 of SEQ ID NO: 16.

The hemA gene encodes glutamyl-tRNA reductase, which catalyses the synthesis of
5-amino levulinic acid. A hemA-deficient strain will have to be supplemented with 5-amino
levulinic acid or haemin.

In another embodiment, the conditionally essential gene encodes a glutamyl-tRNA
reductase, preferably the conditionally essential gene is homologous to the hemA gene from
Bacillus subtilis, and more preferably the conditionally essential gene is the hemA gene from
Bacillus licheniformis; preferably the conditionally essential gene is at least 75% identical,
preferably 85% identical, more preferably 95% and most preferably at least 97% identical to
the hemA sequence of Bacillus licheniformis.

The conditionally essential gene(s) may encode polypeptides involved in the
utilization of specific carbon sources such as xylose, gluconate, glycerol, or arabinose, in
which case the host cell is unable to grow in a minimal medium supplemented with only that
specific carbon source when the gene(s) are non-functional.

A preferred embodiment of the invention relates to a cell of the invention, wherein
the at least one conditionally essential chromosomal gene(s) is one or more genes that are
required for the host cell to grow on minimal medium supplemented with only one specific
main carbon-source.

A preferred embodiment relates to a cell of the invention, wherein the at least one
conditionally essential gene encodes an enzyme required for xylose utilization, preferably the
conditionally essential gene is homologous to the xylA gene from Bacillus subtilis, and more
preferably the conditionally essential gene is homologous to a gene of the xylose isomerase
operon of Bacillus licheniformis, most preferably to the xylA gene of Bacillus licheniformis;
preferably the conditionally essential gene encodes a xylose isomerase and is at least 75%
identical, preferably 85% identical, more preferably 95% and most preferably at least 97%
identical to the xylA gene of Bacillus licheniformis.

Another preferred embodiment relates to a cell of the invention, wherein the at least
one conditionally essential gene encodes an enzyme required for gluconate utilization,
preferably the conditionally essential gene encodes a gluconate kinase (EC 2.7.1.12) or a
gluconate permease, more preferably the gene is homologous to the gntK gene or the gntP
gene from Bacillus subtilis, and most preferably the gene is the gntK or gntP gene from
Bacillus licheniformis; preferably the conditionally essential gene encodes a gluconate kinase
(EC 2.7.1.12) or a gluconate permease or both and is at least 75% identical, preferably 85%
identical, more preferably 95% and most preferably at least 97% identical to any of the gntK
and gntP sequences of Bacillus licheniformis.

Still another preferred embodiment relates to a cell of the invention, wherein the
conditionally essential gene encodes an enzyme required for glycerol utilization, preferably
the conditionally essential gene encodes a glycerol uptake facilitator (permease), a glycerol
kinase, or a glycerol dehydrogenase, more preferably the conditionally essential gene is
homologous to the glpP, glpF, glpK, or the glpD gene from Bacillus subtilis, and most
preferably the conditionally essential gene comprises one or more of the glpP, glpF, glpK,
and glpD genes from Bacillus licheniformis shown in SEQ ID NO:26 of published PCT
application WO 02/00907 A1 (Novozymes A/S) which is incorporated herein by reference in
its totality; preferably the conditionally essential gene encodes a glycerol uptake facilitator
(permease), a glycerol kinase, or a glycerol dehydrogenase, and is at least 75% identical,
preferably 85% identical, more preferably 95% and most preferably at least 97% identical to
any of the glpP, glpF, glpK and glpD sequences of Bacillus licheniformis shown in SEQ ID
NO:26 of WO 02/00907 A1.

One more preferred embodiment relates to a cell of the invention, wherein the
conditionally essential gene encodes an enzyme required for arabinose utilization, preferably
an arabinose isomerase, more preferably the gene is homologous to the araA gene from
Bacillus subtilis, and most preferably the gene is the araA gene from Bacillus licheniformis
shown in SEQ ID NO:38 of WO 02/00907 A1; preferably the conditionally essential gene
encodes an arabinose isomerase, and is at least 75% identical, preferably 85% identical,
more preferably 95% and most preferably at least 97% identical to the araA sequence of
Bacillus licheniformis shown in SEQ ID NO:38 of WO 02/00907 A1.

The amplification unit in the cell of the invention may also include an antibiotic
marker gene. However, as it is preferred not to have marker genes in the chromosome, an
alternative way of removing the marker gene must be employed. Specific restriction enzymes
denoted resolvases excise portions of DNA if each portion is flanked on both sides by certain
recognition sequences known as resolvase sites or res-sites; these resolvase enzymes are
well-known in the art, see e.g. WO 96/23073 (Novo Nordisk A/S) which is included herein by
reference.

A preferred embodiment relates to a cell of the invention, wherein the amplification
unit further comprises an antibiotic selection marker, preferably the selection marker is
flanked by resolvase sites or res-sites.

Subsequent to the action of the resolvase enzyme, the antibiotic resistance marker
flanked by res-sites will have been excised from the chromosome of the cell, leaving only
one copy of the res-site behind as testimony to the procedure.

Accordingly, a preferred embodiment relates to a cell of the invention, wherein the
amplification unit further comprises a resolvase site or res-site.

As the present invention relies on a reduced transcription of the conditionally
essential gene comprised in the amplification unit as compared to its wild-type transcription
level, it may be an advantage to include one or more transcription terminators upstream of
the gene in different reading frames, in order to avoid any unintentional read-through
transcription from a gene further upstream in the chromosome from where the unit was
integrated.

A preferred embodiment relates to a cell of the invention, wherein the conditionally
essential gene comprised in the amplification unit has at least one transcription terminator
located upstream of the gene.

Another way of reducing transcription of the conditionally essential gene is to
express it from a heterologous or completely artificial promoter, which has a reduced activity
as compared to the wild-type or endogenous promoter normally transcribing said gene.
Preferably, the conditionally essential gene is transcribed from a heterologous promoter
having an activity level, when compared with the endogenous promoter of the conditionally
essential gene, which is reduced by a factor of 2, preferably 5, more preferably 10, even
more preferably 50, and most preferably by a factor of 100.
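
The reduction factors can be illustrated with a short calculation. The Python sketch below
assumes that promoter activity has been measured for both promoters in the same arbitrary
units (for example from a reporter assay); the function names, tier labels and the example
readings are hypothetical, and only the factors 2, 5, 10, 50 and 100 come from the text.

```python
def fold_reduction(endogenous_activity: float, heterologous_activity: float) -> float:
    """Factor by which the heterologous promoter is weaker than the endogenous one."""
    if endogenous_activity <= 0 or heterologous_activity <= 0:
        raise ValueError("activities must be positive")
    return endogenous_activity / heterologous_activity


def reduction_tier(fold: float) -> str:
    """Map a fold reduction onto the graded factors quoted in the description."""
    tiers = [
        (100, "most preferred (factor of 100)"),
        (50, "even more preferred (factor of 50)"),
        (10, "more preferred (factor of 10)"),
        (5, "preferred (factor of 5)"),
        (2, "factor of 2"),
    ]
    for threshold, label in tiers:
        if fold >= threshold:
            return label
    return "below the stated factors"


if __name__ == "__main__":
    fold = fold_reduction(200.0, 20.0)   # hypothetical reporter readings
    print(fold, reduction_tier(fold))
```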

Still another strategy could be to have a promoterless conditionally essential gene in
the amplification unit, and then simply rely on what read-through transcription there might be
from any other gene(s) located upstream of the conditionally essential gene, before or after
integration of the unit into the chromosome of the cell of the invention. Preferably, the
conditionally essential gene is promoterless; and more preferably the gene of interest is
located upstream of the conditionally essential gene in the amplification unit, so that the two
genes are co-directionally transcribed, whereby the conditionally essential gene is expressed
by read-through transcription from the gene of interest.

A second aspect of the invention relates to a method for producing a protein
encoded by a gene of interest, comprising
a) culturing a bacterial host cell comprising at least two duplicated copies of an
amplification unit in its genome, the amplification unit comprising:
i) at least one copy of the gene of interest, and
ii) an expressible conditionally essential gene, wherein the conditionally essential
gene is either promoterless or transcribed from a heterologous promoter having
an activity substantially lower than the endogenous promoter of said
conditionally essential gene,
wherein the conditionally essential gene if not functional would render the cell auxotrophic
for at least one specific substance or unable to utilize one or more specific sole carbon
source; and
b) recovering the protein.

As already mentioned, any cell of the invention is envisioned to be suitable in the
methods of the second aspect, in particular the preferred embodiments outlined above.

A final aspect of the invention relates to methods for producing a bacterial cell
comprising two or more amplified chromosomal copies of a gene of interest, the method
comprising:
a) providing a bacterial cell comprising at least one copy of an amplification unit, the unit
comprising:
i) at least one copy of the gene of interest, and
ii) an expressible functional copy of a conditionally essential gene, which is either
promoterless or transcribed from a heterologous promoter having an activity
substantially lower than the endogenous promoter of said conditionally essential
gene,
wherein the conditionally essential gene if not functional would render the cell auxotrophic
for at least one specific substance or unable to utilize one or more specific sole carbon
source;
b) cultivating the cell under conditions suitable for growth in a medium deficient of said at
least one specific substance and/or with said one or more specific sole carbon source,
thereby providing a growth advantage to a cell in which the amplification unit has been
duplicated in the chromosome; and
c) selecting a cell wherein the amplification unit has been duplicated in the chromosome,
whereby two or more amplified chromosomal copies of the gene of interest were
produced.

Again, as already mentioned, the methods of the final aspect of the invention are 
envisioned as being suitable for producing any cell of the invention, in particular the preferred 
embodiments of said cell that are outlined in the above. 

EXAMPLES

Strains and Donor Organisms

Bacillus subtilis PL1801. This strain is the B. subtilis DN1885 with disrupted apr and
npr genes (Diderichsen, B., Wedsted, U., Hedegaard, L., Jensen, B. R., Sjøholm, C. (1990)
Cloning of aldB, which encodes alpha-acetolactate decarboxylase, an exoenzyme from
Bacillus brevis. J. Bacteriol., 172, 4315-4321).

B. subtilis CL046. This strain is a B. subtilis PL1801 where the metE gene is deleted
and replaced with the kanamycin (kan) resistance gene from pUB110 by use of the plasmid
pCL043.

B. subtilis CL049. This strain is the CL046 strain where the kanamycin resistance
gene is deleted.

Competent cells were prepared and transformed as described by Yasbin, R.E.,
Wilson, G.A. and Young, F.E. (1975) Transformation and transfection in lysogenic strains of
Bacillus subtilis: evidence for selective induction of prophage in competent cells. J. Bacteriol.,
121:296-304.

Plasmids

pCL043:

This plasmid is a pBR322 derivative (Watson, N., 1988 Gene 70(2):399-403)
essentially containing elements making the plasmid propagatable in E. coli, an ampicillin
resistance gene, a gene conferring resistance to kanamycin, two flanking fragments from B.
subtilis metE inserted upstream and downstream of the kanamycin resistance gene, two
direct repeats that signify the res site from pAMbeta1 (Janniere, L. 1996, Nucleic Acids Res.
24(17):3431-3436). This plasmid is used for deleting the metE gene in the B. subtilis strain
PL1801.



Table 1. Plasmid pCL043, 7311 bp

Position (bp)   Size (bp)   Element                   Origin
1-973           973         Upstream metE seq.        B. subtilis
974-1010        37          Linker                    Synthetic
1011-1184       174         res site from pAMbeta1    E. faecalis
1185-1190       6           Linker                    Synthetic
1191-2159       969         pUB110 (Kan gene)         S. aureus
2160-2162       3           Linker                    Synthetic
2163-2336       174         res site from pAMbeta1    E. faecalis
2337-2357       21          Linker                    Synthetic
2358-3870       1513        Downstream metE seq.      B. subtilis
3871-7311       3441        pBR322                    E. coli
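
The feature map of Table 1 can be checked for internal consistency: each element should
start where the previous one ends, each size should equal end minus start plus one, and the
last position should equal the stated plasmid length. The Python sketch below is illustrative
only; the names FEATURES and check_feature_map are hypothetical, and the first res site is
taken to span 1011-1184, the reading consistent with its stated size of 174 bp.

```python
# (start, end, stated size, element) for plasmid pCL043, transcribed from Table 1.
FEATURES = [
    (1, 973, 973, "Upstream metE seq."),
    (974, 1010, 37, "Linker"),
    (1011, 1184, 174, "res site"),
    (1185, 1190, 6, "Linker"),
    (1191, 2159, 969, "pUB110 (Kan gene)"),
    (2160, 2162, 3, "Linker"),
    (2163, 2336, 174, "res site"),
    (2337, 2357, 21, "Linker"),
    (2358, 3870, 1513, "Downstream metE seq."),
    (3871, 7311, 3441, "pBR322"),
]


def check_feature_map(features, total_bp):
    """Return a list of inconsistencies; an empty list means the map is coherent."""
    problems = []
    expected_start = 1
    for start, end, size, name in features:
        if start != expected_start:
            problems.append(f"{name}: starts at {start}, expected {expected_start}")
        if end - start + 1 != size:
            problems.append(f"{name}: stated size {size}, computed {end - start + 1}")
        expected_start = end + 1
    if expected_start - 1 != total_bp:
        problems.append(f"map ends at {expected_start - 1}, expected {total_bp}")
    return problems


if __name__ == "__main__":
    print(check_feature_map(FEATURES, 7311) or "feature map is consistent")
```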

pCL01154:

This plasmid is a pBR322 derivative (Watson, N., 1988 Gene 70(2):399-403)
containing elements making the plasmid propagatable in E. coli. The plasmid codes for the
ampicillin resistance gene, the kanamycin resistance gene, the chloramphenicol resistance
gene and the lacZ gene from E. coli. The gfp gene from A. victoria and the metE gene from
B. subtilis are transcriptionally fused in the plasmid controlled by a promoter that can be
exchanged with other promoters. This plasmid is used for integration and amplification studies
in the amyE locus of CL049. The primers for metE fragment PCR amplifications on
chromosomal DNA isolated from PL1801 are as follows:

P52 (SEQ ID NO: 1): a ataataaagatctggaggagaaacaatgacaacc 
P53 (SEQ ID NO: 2): a aataataagatcteaattatactagctgtgtc 

Table 2. Plasmid pCL01154, 13135 bp.

Position (bp)   Size (bp)   Element                   Origin
1-539           539         Upstream amyE             B. subtilis
                            OKA                       B. subtilis
2854-2891       38          Linker                    Synthetic
2892-3605       714         gfp gene                  A. victoria
3606-3739       134         Promoter - air            B. subtilis
3740-3785       46          Linker                    Synthetic
3786-4821       1036        pC194 (cat gene)          S. aureus
4822-5008       187         part of tetC gene         E. coli
5009-5106       98          Promoter                  Synthetic
5107-5111       6           Linker                    Synthetic
5112-8224       3113        spoVG-lacZ fusion         B. subtilis & E. coli
8226-8314       89          part of tetC gene         E. coli
8315-9657       1343        Downstream amyE           B. subtilis
9658-9845       188         Linker                    Synthetic
9846-11117      1272        pUB110 (neo gene)         S. aureus
11118-11184     67          Linker                    Synthetic
11185-11277     93          Tn5 fragment              E. coli
11278-11281     4           Linker                    Synthetic
11282-13119     1838        pBR322 (bla gene)         E. coli
13120-13129     10          Linker                    Synthetic



Propagation of PL1801 strain for LacZ activity determination 

The B. subtilis strain PL1801 was propagated in liquid medium TY. After 10
generations of incubation at 37°C and 300 rpm, the cells were harvested, and cells were
disrupted by sonic or lysozyme treatment.



General molecular biology methods 

Unless otherwise mentioned the DNA manipulations and transformations were
performed using standard methods of molecular biology (Sambrook et al. (1989) Molecular
cloning: A laboratory manual, Cold Spring Harbor lab., Cold Spring Harbor, NY; Ausubel, F.
M. et al. (eds.) "Current protocols in Molecular Biology". John Wiley and Sons, 1995;
Harwood, C. R., and Cutting, S. M. (eds.) "Molecular Biological Methods for Bacillus". John
Wiley and Sons, 1990).

Enzymes for DNA manipulations were used according to the specifications of the
suppliers (e.g. restriction endonucleases, ligases etc. are obtainable from New England
Biolabs, Inc.).



Media 

TY: (as described in Ausubel, F. M. et al. (eds.) "Current protocols in Molecular Biology".
John Wiley and Sons, 1995).

LB agar: (as described in Ausubel, F. M. et al. (eds.) "Current protocols in Molecular
Biology". John Wiley and Sons, 1995).

Minimal TSS agar: As described in Fouet A. and Sonenshein, A. L. (1990) A Target for
Carbon Source-Dependent Negative Regulation of the citB Promoter of Bacillus subtilis. J.
Bacteriol., 172, 835-844. For plates, 2% agar was added and for methionine auxotrophy
determination the plates were supplemented with 50 microg/ml methionine.



Assay for beta-galactosidase activity 

Beta-galactosidase activity was determined by a method using ortho-nitrophenyl-
beta-D-galactopyranoside as enzymatic substrate. Under a specified set of conditions
(temperature, pH, reaction time, buffer conditions) a given amount of beta-galactosidase
will degrade a certain amount of substrate and a yellow colour will be produced. The colour
intensity is measured at 420 nm. The measured absorbance is directly proportional to the
activity of the beta-galactosidase in question under a given set of conditions.
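The proportionality between absorbance at 420 nm and enzyme activity can be illustrated
with a minimal sketch. The blank subtraction, the function name and the calibration
factor `units_per_absorbance` are assumptions made for illustration; none of them is
specified in this document, and any such factor would only hold for one fixed set of
assay conditions.

```python
def activity_from_absorbance(a420, blank_a420, units_per_absorbance):
    """Convert an A420 reading to activity units by a linear calibration.

    units_per_absorbance is a hypothetical calibration factor; it is only
    meaningful for one fixed temperature, pH, reaction time and buffer.
    """
    corrected = a420 - blank_a420  # assumed blank correction
    if corrected < 0:
        raise ValueError("reading is below the blank")
    return corrected * units_per_absorbance


# Illustrative numbers only, not measurements from this document.
sample_units = activity_from_absorbance(0.75, 0.25, 200.0)
print(sample_units)
```

Because the relationship is linear, doubling the blank-corrected absorbance doubles the
reported activity, which is what allows activities to be compared between strains
assayed under identical conditions.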






Deletion of metE in B. subtilis

A plasmid was constructed for the purpose of deleting the metE gene in B. subtilis.
Two flanking sequences upstream and downstream of the metE gene were amplified by PCR
and fused by PCR on each side of a kanamycin (Kan) marker. This fragment was ligated
into plasmid pBR322.

Upstream metE fragment:

P42 (SEQ ID NO: 3): attttataggatcccgctgattcattttcttctgcgaac
P43 (SEQ ID NO: 4): gaattccatcgcactggacgacattttcaggtcgattctcggaaatcc

Downstream metE fragment:

P44 (SEQ ID NO: 5): cccgaggcctttcaggcccgcaaacaatatggttgaagccgcaaaacagg
P45 (SEQ ID NO: 6): ataataatggtaccatattgatgtgacacttgaagttgc
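The four primers can be sanity-checked against the lengths given for SEQ ID NO: 3 to 6
in the sequence listing (39, 48, 50 and 39 nucleotides). The sketch below also reports
which common restriction sites occur in each primer. The choice of sites to look for is
an assumption; the document does not state which sites, if any, were designed into the
primers. The primer sequences are the readings reconciled between the text and the
sequence listing.

```python
# Primer sequences in blocks of ten, as laid out in the sequence listing.
PRIMERS = {
    "P42": "".join("attttatagg atcccgctga ttcattttct tctgcgaac".split()),
    "P43": "".join("gaattccatc gcactggacg acattttcag gtcgattctc ggaaatcc".split()),
    "P44": "".join("cccgaggcct ttcaggcccg caaacaatat ggttgaagcc gcaaaacagg".split()),
    "P45": "".join("ataataatgg taccatattg atgtgacact tgaagttgc".split()),
}

# Lengths from the <211> fields of SEQ ID NO: 3 to 6.
EXPECTED_LENGTH = {"P42": 39, "P43": 48, "P44": 50, "P45": 39}

# Assumed panel of recognition sites to search for (not taken from the document).
SITES = {"BamHI": "ggatcc", "EcoRI": "gaattc", "StuI": "aggcct", "KpnI": "ggtacc"}


def reverse_complement(seq):
    """Return the reverse complement of a lower-case DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def sites_in(seq):
    """Names of the sites found on either strand of seq, sorted alphabetically."""
    strands = (seq, reverse_complement(seq))
    return sorted(
        name for name, site in SITES.items() if any(site in s for s in strands)
    )


for name, seq in PRIMERS.items():
    length_ok = len(seq) == EXPECTED_LENGTH[name]
    print(name, len(seq), "length ok" if length_ok else "length mismatch", sites_in(seq))
```

A length mismatch would point to a transcription error in a primer sequence rather than
to a problem with the construct itself.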

The resulting plasmid pCL043 (SEQ ID NO: 7) was linearised and transferred to B.
subtilis PL1801, and the cells were plated on LBPG media with 10 µg/ml kanamycin, which
left the Kan marker in place of the metE gene.

A metE deletion strain designated CL046 was tested on minimal media without
methionine. The original B. subtilis PL1801 (metE+) strain showed fine growth on these
plates, while the metE- strain CL046 showed no growth even after several days of
incubation. On control minimal plates supplemented with 50 µg/ml methionine, both strains
grew. The reported auxotrophic phenotype of a metE- strain is therefore confirmed.

The Kan marker located in the metE locus of CL046 was flanked by resolvase
recognition sites (res), which allow a specific excision reaction in the presence of a
resolvase. In order to remove the Kan marker from the chromosome, CL046 was
transformed with pWT, which is a temperature-sensitive plasmid that comprises a gene
coding for resolvase and an erythromycin (Erm) resistance marker. Transformants were
selected on plates with 5 microg/ml Erm. They were tested for loss of the Kan marker and
further re-streaked twice on plates with no antibiotics at 50°C to cure the strains of the
pWT plasmid. Selected clones were screened for loss of Erm resistance and Kan resistance
and were designated CL049 (PL1801, metE-; no antibiotic markers).

Amplification plasmids 

An amplification plasmid was made having a transcriptional unit consisting of the gfp
gene and the metE gene, with a cloning site in front of the two genes wherein a promoter
could be cloned (pCL01154, SEQ ID NO: 8). The lacZ reporter gene was also present on the
plasmid, expressed from a promoter separate from the promoter in front of the metE gene.
Flanking these two transcriptional units were fragments from the amyE locus in B. subtilis.

Promoters with varying promoter activity were cloned in front of the gfp-metE
transcriptional unit in the EcoRI and HindIII sites. The promoter activities spanned from
30 to 519 arbitrary units. See Table 3.



Promoter    Activity/Units    Sequence
Pr30        30                (SEQ ID NO: 9)
Pr43        43                (SEQ ID NO: 10)
Pr119       119               (SEQ ID NO: 11)
Pr164       164               (SEQ ID NO: 12)
Pr342       342               (SEQ ID NO: 13)
Pr409       409               (SEQ ID NO: 14)
Pr519       519               (SEQ ID NO: 15)

Table 3: The table shows the promoters used in the amplification experiment, with their
activities and sequences.

Amplification experiments 

The resulting amplification plasmids were introduced by transformation into CL049
(metE-) and plated on solid LB media supplemented with 6 microg/ml chloramphenicol.
Transformants were screened for resistance to kanamycin.

Transformants sensitive to kanamycin would have integrated only part of the
amplification plasmid at the amyE locus, including the lacZ reporter gene and the gfp-metE
operon. Those transformants would have only one copy of the genes present, and these
cannot be amplified.

Transformants resistant to kanamycin would have the whole amplification
plasmid integrated at the amyE locus, and amplification would be possible.

Both types of transformants were plated on solid minimal TSS media without
methionine. Several colonies were obtained from the transformants having the whole plasmid
integrated at the amyE locus, whereas the transformants that had only part of the plasmid
integrated showed no growth on minimal medium. This indicated that even with the strongest
promoter, one copy of the metE gene did not express sufficient MetE protein to complement
the methionine auxotrophy of the strain. However, amplification of the metE gene did result
in growth of the strain.

Colonies were picked from the amplification step along with colonies that had only
one copy of the metE gene integrated in the chromosome. They were all grown in liquid LB
and harvested in the exponential growth phase, followed by measurement of
beta-galactosidase activity. The following table gives the results from the evaluation of
the amplification outcomes.

A few clones show irregular enzyme activities, which can be explained by
up-mutations in the promoters.



Promoter strength    Strain           Units    Copies
30                   1 gene copy      105      1.0
30                   Amplification    1361     12.4
30                   Amplification    218      2.0
43                   1 gene copy      101      0.9
43                   Amplification    1457     13.4
43                   Amplification    1460     13.3
119                  1 gene copy      113      1.0
119                  Amplification    1055     9.6
119                  Amplification    1075     9.8
164                  1 gene copy      102      0.9
164                  Amplification    881      8.0
164                  Amplification    855      7.8
342                  1 gene copy      134      1.2
342                  Amplification    606      5.5
409                  1 gene copy      105      1.0
409                  Amplification    533      4.9
409                  Amplification    493      4.5
519                  1 gene copy      105      1.0
519                  Amplification    544      5.0
519                  Amplification    114      1.0

Table 4: The table shows the results from the amplification trials and the
beta-galactosidase activity measured in all strains after growth in LB liquid media. The
enzyme activities have been converted to the gene copy number of the reporter gene.
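The conversion from enzyme activity to copy number described for Table 4 can be sketched
as a simple ratio against a single-copy reference. The document does not state which
reference value was used; the sketch below assumes the mean activity of the seven
single-copy strains, so its estimates can differ from the tabulated copy numbers in the
last digit.

```python
from statistics import mean

# Units measured for the "1 gene copy" strains in Table 4.
SINGLE_COPY_UNITS = [105, 101, 113, 102, 134, 105, 105]


def estimate_copies(units, reference_units):
    """Estimate reporter gene copy number as activity relative to one copy."""
    return round(units / reference_units, 1)


# Assumed reference: the mean single-copy activity (not stated in the document).
reference = mean(SINGLE_COPY_UNITS)

for units in (1457, 1055, 606):
    print(units, "units ->", estimate_copies(units, reference), "copies (estimate)")
```

The estimate treats beta-galactosidase activity as strictly proportional to lacZ copy
number, which is the same assumption the table itself relies on.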

The results summarized herein show that it is indeed possible to increase the copy
number of a chromosomally integrated expression cassette holding a weakly expressed
metE gene by growing the strain on minimal medium without methionine. The amplification
potential of >10 copies (up to 25 copies have been observed), as judged from the enzyme
activities, is very similar to what can be achieved by the traditional kanamycin antibiotic
selection/amplification.

CLAIMS

1. A bacterial host cell comprising at least two copies of an amplification unit in its
genome, said amplification unit comprising:

i) at least one copy of a gene of interest, and
ii) an expressible conditionally essential gene, wherein the conditionally essential gene
is either promoterless or transcribed from a heterologous promoter having an activity
substantially lower than the endogenous promoter of said conditionally essential
gene, and

wherein the conditionally essential gene if not functional would render the cell auxotrophic
for at least one specific substance or unable to utilize one or more specific sole carbon
source.

2. The cell of claim 1, wherein the bacterial cell is a prokaryotic cell.

3. The cell of claim 2, wherein the bacterial prokaryotic cell is a Gram-positive cell.

4. The cell of claim 3, wherein the bacterial Gram-positive cell is a species of the genus
Bacillus, preferably selected from the group consisting of Bacillus alkalophilus, Bacillus
amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus coagulans, Bacillus lautus,
Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus stearothermophilus,
Bacillus subtilis, and Bacillus thuringiensis.

5. The cell of any of claims 1-4, wherein the gene of interest encodes an enzyme,
preferably an amylolytic enzyme, a lipolytic enzyme, a proteolytic enzyme, a cellulytic
enzyme, an oxidoreductase or a plant cell-wall degrading enzyme, and more preferably an
enzyme with an activity selected from the group consisting of aminopeptidase, amylase,
amyloglucosidase, carbohydrase, carboxypeptidase, catalase, cellulase, chitinase, cutinase,
cyclodextrin glycosyltransferase, deoxyribonuclease, esterase, galactosidase,
beta-galactosidase, glucoamylase, glucose oxidase, glucosidase, haloperoxidase,
hemicellulase, invertase, isomerase, laccase, ligase, lipase, lyase, mannosidase, oxidase,
pectinase, peroxidase, phytase, phenoloxidase, polyphenoloxidase, protease, ribonuclease,
transferase, transglutaminase, or xylanase.

6. The cell of any of claims 1-4, wherein the gene of interest encodes an antimicrobial
peptide, preferably an anti-fungal peptide or an anti-bacterial peptide.

7. The cell of any of claims 1-4, wherein the gene of interest encodes a peptide with
biological activity in the human body, preferably a pharmaceutically active peptide, more
preferably insulin/pro-insulin/pre-pro-insulin or variants thereof, growth hormone or
variants thereof, or blood clotting factor VII or VIII or variants thereof.

8. The cell of any of claims 1-7, wherein the conditionally essential gene encodes an
enzyme from the biosynthetic pathway of an amino acid.

9. The cell of claim 8, wherein the conditionally essential gene encodes one or more
polypeptide(s) involved in lysine, leucine or methionine synthesis, preferably the
conditionally essential gene is homologous to the lysA, leuB, metC, or the metE gene from
Bacillus subtilis, and more preferably the conditionally essential gene is the lysA, leuB,
metC, or metE gene from Bacillus licheniformis.

10. The cell of claim 8, wherein the conditionally essential gene is at least 75% identical,
preferably 85% identical, more preferably 95% and most preferably at least 97% identical to
the lysA sequence of Bacillus licheniformis shown in SEQ ID NO:48 of WO 02/00907 A1, the
leuB sequence of Bacillus licheniformis, the metC sequence of Bacillus licheniformis shown
in SEQ ID NO:42 of WO 02/00907 A1, or the metE sequence of Bacillus subtilis shown in
positions 997 to 2199 of SEQ ID NO:16.

11. The cell of any of claims 1-7, wherein the conditionally essential gene encodes a
glutamyl-tRNA reductase, preferably the conditionally essential gene is homologous to the
hemA gene from Bacillus subtilis, and more preferably the conditionally essential gene is
the hemA gene from Bacillus licheniformis.

12. The cell of any of claims 1-7, wherein the conditionally essential gene is at least
75% identical, preferably 85% identical, more preferably 95% and most preferably at least
97% identical to the hemA sequence of Bacillus licheniformis.

13. The cell of any of claims 1-7, wherein the conditionally essential gene encodes an
enzyme required for xylose utilization, preferably the conditionally essential gene is
homologous to the xylA gene from Bacillus subtilis, and more preferably the conditionally
essential gene is homologous to a gene of the xylose isomerase operon of Bacillus
licheniformis, most preferably to the xylA gene of Bacillus licheniformis.

14. The cell of any of claims 1-7, wherein the conditionally essential gene encodes a
xylose isomerase and is at least 75% identical, preferably 85% identical, more preferably
95% and most preferably at least 97% identical to the xylA gene of Bacillus licheniformis.

15. The cell of any of claims 1-7, wherein the conditionally essential gene encodes an
enzyme required for gluconate utilization, preferably the conditionally essential gene
encodes a gluconate kinase (EC 2.7.1.12) or a gluconate permease, more preferably the gene
is homologous to the gntK gene or the gntP gene from Bacillus subtilis, and most preferably
the gene is the gntK or gntP gene from Bacillus licheniformis.

16. The cell of any of claims 1-7, wherein the conditionally essential gene encodes a
gluconate kinase (EC 2.7.1.12) or a gluconate permease or both and is at least 75%
identical, preferably 85% identical, more preferably 95% and most preferably at least 97%
identical to any of the gntK and gntP sequences of Bacillus licheniformis.

17. The cell of any of claims 1-7, wherein the conditionally essential gene encodes an
enzyme required for glycerol utilization, preferably the conditionally essential gene
encodes a glycerol uptake facilitator (permease), a glycerol kinase, or a glycerol
dehydrogenase, more preferably the conditionally essential gene is homologous to the glpP,
glpF, glpK, or the glpD gene from Bacillus subtilis, and most preferably the conditionally
essential gene comprises one or more of the glpP, glpF, glpK and glpD genes from Bacillus
licheniformis shown in SEQ ID NO:26 of WO 02/00907 A1.

18. The cell of any of claims 1-7, wherein the conditionally essential gene encodes a
glycerol uptake facilitator (permease), a glycerol kinase, or a glycerol dehydrogenase, and
is at least 75% identical, preferably 85% identical, more preferably 95% and most preferably
at least 97% identical to any of the glpP, glpF, glpK, and glpD sequences of Bacillus
licheniformis shown in SEQ ID NO:26 of WO 02/00907 A1.

19. The cell of any of claims 1-7, wherein the conditionally essential gene encodes an
enzyme required for arabinose utilization, preferably an arabinose isomerase, more
preferably the gene is homologous to the araA gene from Bacillus subtilis, and most
preferably the gene is the araA gene from Bacillus licheniformis shown in SEQ ID NO:38 of
WO 02/00907 A1.

20. The cell of any of claims 1-7, wherein the conditionally essential gene encodes an
arabinose isomerase, and is at least 75% identical, preferably 85% identical, more
preferably 95% and most preferably at least 97% identical to the araA sequence of Bacillus
licheniformis shown in SEQ ID NO:38 of WO 02/00907 A1.

21. The cell of any of claims 1-20, wherein the amplification unit further comprises an
antibiotic selection marker, preferably the selection marker is flanked by resolvase sites
or res-sites.

22. The cell of any of claims 1-20, wherein the amplification unit further comprises a
resolvase site or res-site.

23. The cell of any of claims 1-22, wherein the conditionally essential gene comprised in
the amplification unit has at least one transcription terminator located upstream of the
gene.

24. The cell of any of claims 1-23, wherein the conditionally essential gene is
transcribed from a heterologous promoter having an activity level, when compared with the
endogenous promoter of the conditionally essential gene, which is reduced with a factor of
2, preferably 5, more preferably 10, even more preferably 50, and most preferably with a
factor of 100.

25. The cell of any of claims 1-23, wherein the conditionally essential gene is
promoterless.

26. The cell of claim 25, wherein the gene of interest is located upstream of the
conditionally essential gene in the amplification unit, and wherein the two genes are
co-directionally transcribed.

27. The cell of claim 26, wherein the conditionally essential gene is expressed by
read-through transcription from the gene of interest.

28. A method for producing a protein encoded by a gene of interest, comprising

a) culturing a bacterial host cell comprising at least two duplicated copies of an
amplification unit in its genome, the amplification unit comprising:

i) at least one copy of the gene of interest, and

ii) an expressible conditionally essential gene, wherein the conditionally essential
gene is either promoterless or transcribed from a heterologous promoter having
an activity substantially lower than the endogenous promoter of said
conditionally essential gene,

wherein the conditionally essential gene if not functional would render the cell auxotrophic
for at least one specific substance or unable to utilize one or more specific sole carbon
source; and

b) recovering the protein.

29. A method for producing a bacterial cell comprising two or more amplified
chromosomal copies of a gene of interest, the method comprising:

a) providing a bacterial cell comprising at least one copy of an amplification unit, the
unit comprising:

i) at least one copy of the gene of interest, and

ii) an expressible functional copy of a conditionally essential gene, which is either
promoterless or transcribed from a heterologous promoter having an activity
substantially lower than the endogenous promoter of said conditionally essential
gene,

wherein the conditionally essential gene if not functional would render the cell auxotrophic
for at least one specific substance or unable to utilize one or more specific sole carbon
source;

b) cultivating the cell under conditions suitable for growth in a medium deficient of said
at least one specific substance and/or with said one or more specific sole carbon source,
thereby providing a growth advantage to a cell in which the amplification unit has been
duplicated in the chromosome; and

c) selecting a cell wherein the amplification unit has been duplicated in the chromosome,
whereby two or more amplified chromosomal copies of the gene of interest were
produced.

TITLE: Method for stable gene-amplification in a bacterial host cell
ABSTRACT

A bacterial host cell comprising at least two copies of an amplification unit in its
genome, said amplification unit comprising: i) at least one copy of a gene of interest, and
ii) an expressible conditionally essential gene, wherein the conditionally essential gene is
either promoterless or transcribed from a heterologous promoter having an activity
substantially lower than the endogenous promoter of said conditionally essential gene, and
wherein the conditionally essential gene if not functional would render the cell auxotrophic
for at least one specific substance or unable to utilize one or more specific sole carbon
source; methods for producing a protein using the cell of the invention, and methods for
constructing the cell of the invention.

01-SQ Listing-31 Oct 2003.ST25.txt

SEQUENCE LISTING

<110> Novozymes A/S

<120> metE gene-amplification

<130> 10442.000-DK

<160> 17

<170> PatentIn version 3.2

<210> 1
<211> 35
<212> DNA
<213> artificial sequence

<220>
<223> Primer P52

<400> 1
aataataaag atctggagga gaaacaatga caacc                                 35

<210> 2
<211> 33
<212> DNA
<213> artificial sequence

<220>
<223> Primer P53

<400> 2
aaataataag atctaaatta tactagctgt gtc                                   33

<210> 3
<211> 39
<212> DNA
<213> artificial sequence

<220>
<223> Primer P42

<400> 3
attttatagg atcccgctga ttcattttct tctgcgaac                             39

<210> 4
<211> 48
<212> DNA
<213> artificial sequence

<220>
<223> Primer P43

<400> 4
gaattccatc gcactggacg acattttcag gtcgattctc ggaaatcc                   48

<210> 5
<211> 50
<212> DNA
<213> artificial sequence

<220>
<223> Primer P44

<400> 5
cccgaggcct ttcaggcccg caaacaatat ggttgaagcc gcaaaacagg                 50

<210> 6
<211> 39
<212> DNA
<213> artificial sequence

<220>
<223> Primer P45

<400> 6
ataataatgg taccatattg atgtgacact tgaagttgc                             39

<210> 7
<211> 7311
<212> DNA
<213> artificial sequence

<220>
<223> Plasmid pCL043

<400> 7

cxxg9agggc caagcgatgt gccagagcta aaagaagcag tgaaaaacgc agtgaaaaac &d 

ggagxgcxtg rcgttxgxgc agcgggaaax gaaggtgacg gcgacgaacg caeagaagag 120 

cxtxcctacc ccgcagctta taatgaagtg attgcagttg gaxetg«xc tgcagcgcga 180 

gaattatcag aatxxtctaa cgcgaaxaaa gagatxgacc ttgtggcacc aggagaaaac 240 

atcxtaxcca cccrtcccaa caagaagxac ggtaagctga ccggcacttc aatggcxgcc 300 

cctcatgtca gcggtgcgct tgctttaatc aaaagctatg aagaagaatc atntcaaaga 360 

aagcrttctg aaxctgaggt xrtcgcacag ctaatccgca ggacacttcc tctxgataxx 420 

gcaaaaacgc tggcaggcaa nggattcctg taxttaacag ctcctgatga gctcgcagaa 460 

aaagcagagc aatcacattt gttgacccta taagattatt tttcttatat aatatacacc 540 

acatcatgta aataaaaaxx tcaaattcta xgxtgacaat gaatttgaat tacxgttaag 

attaccaaca aatgattcaa cttttcaaaa aattaaxaac atxrtctctt atcgagagtt 

gggcgaggga ttggcctttt gaccccaaca gcaaccgacc gtaaxaccat xgtgaaatgg 720 

ggcgcactgc ttttcgcgcc gagactgatg tctcataagg cacggtgcta axtccatcag 780 

attgtgxctg agagatgaga gaggcagtgt txxacgtaga aaagccxctt tctctcargg 840 

gaaagaggct txxxgttgtg agaaaacctc ttagcagcct gtatccgcgg gtgaaagaga 900 

gtgxcttaca rataaaggag gagaaacaat gacaaccatc aaaacatcga axxtaggatt 960 

tccgagaatc gacctgaaaa tgtcgtccag tgcgatggaa txctgatcaa atggxtcagx 1020 

gagagcgaag cgaacacttg atttrttaat tttctatctx ttataggtca ttagagtata 1080 

cttatttgTc ctataaacta tttagcagca raatagartt attgaatagg tcattraagt 1140 

tgagcataxx agaggaggaa aatcttggag aaaxatttga agaacccgaa cgcgtgagta 1200 

gttcaacaaa cgggccagtt tgxtgaagax tagaxgctat aaxtgtxatt aaaaggatxg 1260 

aaggatgctt aggaagacga gtraxxaata gcxgaaxaag aacggtgctc tccaaatact 1320 


cttatxtaga aaagcaaaxc xaaaattatc tgaaaaggga atgagaatag xgaatggacc 1380 

aaxaaxaatg acxagagaag aaagaatgaa gaxxgttcax gaaattaagg aacgaaxaxx 1440 

ggataaatat ggggatgatg xxaaggctax tggxgtttat ggctctcxxg gtcgxcagac 1500 

xgaxgggccc xaxtcggaxa txgagatgax gxgtgtcatg ncaacagagg aagcagagtt: 1560 

cagccatgaa tggacaaccg gtgagxggaa ggtggaagtg aattttgata gcgaagagat 1620 

xcxactagax taxgcaxctc aggtggaaxc agaxxggccg cxtacacatg gtcaattttt 1680 

ctctaxxxtg ccgatttatg attcaggtgg axacttagag aaagtgtatc aaacxgcxaa 1740 

atcggxagaa gcccaaacgx xccacgatgc gatttgtgcc cxtatcgtag aagagctgXX 1800 

xgaatatgca ggcaaatggc gxaataxxcg tgtgcaagga ccgacaacat ttctaccatc 1860 

cttgactgta caggtagcaa tggcaggtgc catgttgatt ggxctgcatc atcgcatctg 1920 

xtaxacgacg agcgcttcgg xctxaactga agcagttaag caatcagatc xxccttcagg 1980 

xxatgaccat ctgxgccagx xcgtaaxgtc tggtcaacxx Xccgacxctg agaaacttcx 2040 

ggaatcgcxa gagaattxct ggaatgggax tcaggagtgg acagaacgac acggaratat 2100 

agtggatgtg xcaaaacgca taccarcttg aacgatgacc tctaaxaatt gttaaxcatg 2160 

xtggagcxca gtgagagcga agcgaacact xgatttttxa attttcxatc xxxtaxaggx 2220 

caxtagagxa tactxatttg xcctaxaaac tatxtagcag cataatagat ttaxtgaata 2280 

ggxcatttaa gttgagcata ttagaggagg aaaatcxtgg agaaatatxx gaagaacccg 2340 

aggccxrtca ggcccgcaaa caaxaxggtt gaagccgcaa aacaggcaag agcacagcag 2400 

acacagcxag tataaxttga aaaaaccaxc tgcatttggc agaxggxttx xxxcxataax 2460 

acagccacaa tcggtrtctt atttagcaaa tcccccaaat actttgttxa ttttgcactt 2520 

xtxxaagaat gxxctttgca ttcxxxtcgg ctatactaat aacacxctat tgacaggagg 2580 

gatxgggatg aaxcatgaaa cgxxcxxaaa acgggctgxc acxctcgcat gxgaaggagx 2640 

gaatgcagga atcggcgggc cxttxggagc cgttatcgtg aaagacggag ccaxxattgc 2700 

agagggacag aacaacgtca caacaagcaa tgaxccgacx gcccacgcgg aagtcacagc 2760 

tatxcggaaa gcctgtaagg tgcxaggagc cxaccagctt gatgactgca xttxgtatac 2820 

gagcxgtgaa ccatgcccaa xgtgcxtggg cgccaxcxac tgggcccggc ctaaagccgt 2880 

tttctargca gcxgagcaca cagacgctgc cgaagccggg rttrgatgaxt carxcattta 2940 

taaagaaatx gataaacctg ctgaagaaag aacgaxcccc ttxxatcaag xgacacxaac 3000 

agagcaxtta xccccgxxxc aagcaxggcg gaacttcgcc aataagaaag aaxatxaaaa 3060 

ggatcaggca xgcgcggcct ggxccxxgxt atxxcxccaa gtagccgcta xgccctgxgc 3120 

aaatacaaaa cagcaxatac gcgcaaxxca gcacggcaga caccgtgcca gccacccgcx 3180 

xcatctgxaa ctxxxggttt aaaggcaxgc txcaaacgct tcxctgaagx trttatcaxaa 3240 

axctgtgccc gccccgcatg tccgacacca aaaaacatcc xgagaatccx caggaxgccg 3300 

gtcattaxtt taatxcxagt xrcacatcaa catttccxct ggttgccxtr gagtaaggac 3360 


agaattcatg agcggcgtxg acaagcxcrt gtgccttttc ccgatctaaa xctttcgtgx 3420 

tcacaacaag xgtgacaccg atttxaaacc cgccgxcgct ctcatccttc atgaggctga 3480 

ccxgcccttc aaxctccgaa tcaaxxtcga xattctgctc tttggctacg xgttcgagcg 3S40 

cgccgccgaa gcatgeagca taccccgccg caaagagctg ttccggattt gtrgccggttt 3600 

gxccttcttt tttggcattt ggcatgacaa tatcaaaatc aagaacaccg tcatctgatg 3660 

taaxargtcc tgetcgtccg cctcgcgcgg ttacttxtgc tgxaaatagx gccaratttc 3720 

ccaaccxcct tatttgtatc tagrcgttat atttccctrc ctgatctittrt taaacatgct 3780 

atgxttgccg agaataggaa aagtgaggrt tttcagatac aatagaatcg aaxgacaaaa 3840 

aagagrtggt gaacaaaatg gaaaataaat ttgatcatat gcggtgtgaa ataccgcaca 3900 

gaxgcgtaag gagaaaaxac cgcatcaggc gctctxccgc ttcctcgcrc actgactcgc 3960 

tgcgctcggt cgttcggcxg cggcgagcgg tatcagcxea ctcaaaggcg gtaatacggt 4020 

tatccacaga axcaggggax aacgcaggaa agaacatgxg agcaaaaggc cagcaaaagg 4080 

ccaggaaccg taaaaaggcc gcgtrtgcxgg cgtttttcca taggctccgc ccccctgacg 4140 

agcatcacaa aaaxcgacgc tcaagxcaga ggtggcgaaa cccgacagga ctataaagax 4200 

accaggcgtt tccccctgga agcrccctcg tgcgcrctcc tgttccgacc ctgccgctta 4260 

ccggatacct greegccrtt ctcccttcgg gaagcgxggc gcrttctcar agctcacgct 4320 

gtaggratct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc 4380 

ccgtttagcc cgaccgctgc gccxtatccg gtaactatcg tcxtgagtcc aacccggtaa 4440 

gacacgactt atcgeeactg gcagcagcca cxggtaacag gartagcaga gcgaggtatg 4500 

taggcggxgc tacagagttc txgaagtggx ggcctaacta cggctacact agaaggacag 4560 

xatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagxt ggxagctcxt 4620 

gaxccggcaa acaaaccacc gcxggtagcg gtggxrttxt tgtttgcaag cagcagatxa 4680 

cgcgcagaaa aaaaggatct caagaagatc ctttgatctt txctacgggg tctgacgctc 4740 

agtggaacga aaacxcacgr taagggattt xggtcargag attaxcaaaa aggatcxxca 4800 

cctagatcct trtaaatxaa aaatgaagtt xxaaatcaat ctaaagtata xatgagtaaa 4860 

cttggtcxga cagttaccaa xgcttaarca gtgaggcacc tatctcagcg axcxgtctat 4920 

ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgara cgggagggct 4980 

xaccatctgg ccccagtgct gcaatgarac cgcgagaccc acgctcaccg gctccagatt 5040 

tarcagcaax aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaacxttat 5100 

ccgcctccat ccagtcxarfc aattgttgcc gggaagcxag agxaagtagx tcgccagxta 5160 

axagtirtgcg caacgttgtt gccaxtgcxg caggcaxcgt ggtgxcacgc xcgtcgxxtg 5220 

gtarggcttc artcagcxcc ggtxcccaac gatcaaggcg agtxacatga Xcccccaxgt 5280 

tgxgcaaaaa agcggxtagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg 5340 

cagtgxxatc acxcanggtx axggcagcac Xgcataattc tcrttactgxc atgccatccg 5400 
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xaagatgcxx ttcxgxgact ggxgagtacx caaccaagtc attctgagaa tagtgtatgc 5460 

ggcgaccgag ttgctcttgc ccggcgtcaa cacgggataa taccgcgcca caxagcagaa 5520 

ccttaaaagt gctcatcatt ggaaaacgtt cxxcggggcg aaaactcxca aggaxcttac 5580 

cgctgttgag atccagttcg atgtaaccca ctcgxgcacc caactgatct tcagcatctt 5640 

ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg 5700 

gaataagggc gaeacggaaa tgttgaatac tcatacxcxx cctttttcaa taxxattgaa 5760 

gcatttatca gggxtattgt ctcatgagcg gatacatatt tgaatgtaxx tagaaaaata 5820 

aacaaaxagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc xaagaaacca 5860 

xxatrtatcax gacattaacc tataaaaata ggcgtatcac gaggcccttx cgxcxxcaag 5940 

aaxtctcatg tttgacagct xaxcaxcgat aagctttaat gcggtagttt axcacagtta 6000 

aattgctaac gcagxcaggc accgtgtatg aaatrcxaaca atgcgcteat cgtcatcctc 6060 

ggcaccgtca ccctggatgc tgtaggcaxa ggctxggtta tgccggtacc gccgggcctc 6120 

ttgcgggata tcgtccattc cgacagcatc gccagtcact atggcgtgct gcxagcgcxa 6180 

xatgcgttga xgcaatttcx atgcgcaccc gxtctcggag caetgxccga ccgcttxggc 6240 

cgccgcccag trcctgctcgc ttcgctactt ggagccacta tcgactacgc gatrcatggcg 6300 

aqcacacccg tcctgtggat ccxctacgcc ggaegcaxcg tggccggcax caccggcgcc 6360 

acaggtgcgg ttgctggcgc cxaxatcgcc gacatcaccg atggggaaga xcgggctcgc 6420 

cacttcgggc tcatgagcgc xtgxxxcggc gtgggtaxgg tggcaggccc cgtggccggg 6480 

ggacxgxxgg gcgccatctc ctxgcaxgca ccaxtccxxg cggcggcggx gctcaacggc 6540 

ctc&accxac tacxgggctg cxxcctaaxg caggagxcgc ataagggaga gtgtcgaccg 6600 

axgccctxga gagccxxcaa cccagtcagc xccxtccggx gggcgcgggg caxgactatc 6660 

gtcgccgcac xxatrgactgx ctrtctxxaxc atgcaactcg taggacaggx gccggcagcg 6720 

ctctgggtca xxxxcggcga ggaccgctxt cgctggagcg cgacgatgax cggcctgxcg 6780 

ctxgcggxat xcggaatctt gcacgcccxc gctcaagcct tcgxcactgg xcccgccacc 6840 

aaacgxxxcg gcgagaagca ggccaxtaxc gccggcatgg cggccgacgc gcxgggctac 6900 

gxcttgctgg cgttcgcgac gcgaggctgg atggccttcc eeattatgat tctxctcgcr 6960 

tccggcggca xcgggatgcc cgcgttgcag gccaxgctgt ccaggcaggt agatgacgac 7020 

catcagggac agcxtcaagg atcgctcgcg gctcttacca gcctaacxtc gaxcactgga 7080 

ccgcxgaxcg xcacggcgat ttaxgccgcc xcggcgagca catggaacgg gxxggcatgg 7140 

atxgxaggcg ccgccctaxa ccttgtcxgc cxccccgegt tgcgtcgcgg xgcatggagc 7200 

cgggccacct cgacctgaax ggaagccggc ggcacctcgc taacggaxtc accactccaa 7260 

gaattggagc caatcaattc ttgcggagaa ctgtgaatgc gcaaaccaac c 7311

<210> 8
<211> 13129
<212> DNA
<213> artificial sequence

<220>
<223> Plasmid pCL01154

<220>
<221> misc_feature
<222> (5067)..(5067)
<223> n is a, c, g, or t

<400> 8

aacaaaattc tccagxcxxc acaxcggxtrt gaaaggagga agcggaagaa tgaagtaaga 60 

gggaxxxttg acxccgaagx aagtcttcaa aaaatcaaat aaggagtgxc aagaatgxtrt 120 

gcaaaacgat tcaaaacctc xxxactgccg ttattcgctg gattrttatt gctgtxxcax 180 

ttggttctgg caggaccggc ggcxgcgagt gctgaaacgg egaacaaatc gaaxgagctx 240 

acagcaccgt cgatcaaaag cggaaccaxt cttcatgcat ggaattggtc gttcaaxacg 300 

xtaaaacaca atatgaagga xaxtcaxgax gcaggatata cagccattca gacatctccg 360 

atraaccaag traaaggaagg gaatcaagga gataaaagca tgtcgaactg gxactggctg 420 

tatcagccga catcgtarca aaxxggcaac cgttacttag gtactgaaca agaaxttaaa 4B0 

gaaatgtgtg cagccgctga agaatatggc ataaaggtca txgttgacgc gcggccgcgg 540 

atetaaaxxa tactagctgt gtctgctgtg ctctrgcctg txtxgcggct tcaaccaxat 600 

tttxcaatgc xgcaaccgxx t:ctrcccxgct gtcttgtttt caatccgcag tctggattta 660 

cccagaagcg gtcagtcgga cagacggcaa gcgcaxcaac gaxaatatxg tacattxott 720 

cagttgacgg eaeacgaggg ctgtgaatgt catatacacc aaggccaagc cctttcaaat 780 

acgggtggtt rttxaagxaa tctaaaaatc ctccgtggct tcxgcxatgx tcgaxtgtaa 840 

tcacatcggc axcaagaxca irtgaxxgxat caacgatatc xtcgaagtxg cxgxagcaca 900 

-fcatgxgxatg aatttgtgtc xcgttttxca cggaagaagt ggttaatcxg aaagcxxctg 960 

ccgcccaagt caaaxactca tcccaatcgc gggtttxcaa tggaaggcct tcacgcagcg 1020 

ctggxxcatc gacxtgaatg atrtrtgaatgc ctgcgtcttc aagcgcxtxa acxtcxtxgc 1OB0 

gaagggcaag cccgaxxtgg aaggcgaxtt cttxccxcga gaxgtcgtxx cgagggaaag 1140 

accagxxxaa gattgxaacc gggcccgxca gcattccttt cacaxgctxg gaagtcaaxg 1200 

actgtgcgxa gactgxgxct txcactgtca xcggttcaat aaaxxcaaca tctccgtaaa 1260 

tgacxggcgg gcggacacag cgtgagccgt atgaxtgaac ccaggcatax ttagtgaagg 1320 

cgaaaccggc cagcxtttca ccgaagxaxx cgaccaxgtc tgtccgttca aattcgccgt 1380 

gaacaaggac axcaagcxcc aatxcxxcct gaatatcaax ccatctxttx gxxxccgcax 1440 

xgaxaaagtt txgatactgx xcatcggacc acxcagcttx ccgccaxxxx tggcgxgccc 1500 

tccgcacttc agcagacxgc gggaagctgc cgatcgxtgt cgtcggcaaa agcggaaggc 1560 

cgagagattc axxttgtagg gctaaacgxt ctxcaaacgg aatcgggcgc ttgaagxctt 1620 

taxcagttaa cxgcxcaagc xcttrcrxtt gtxcagaatt ggcgcctgtx gcaaacxgxt 1680 


taagcgcctg gaxatcagcc ttagccxgct gaaxctcttc gctgaxcgcc gctxrtcctg 1740 

atactaagcc ttcxxtcaaa gctgtcagct cggccagctt ttcttttgcg taggataatc 1800 

cgxtcaatag gtcxxxxrtcc aaatgctcat cagggtgxtt cgcxactgga acatcgagca 1860 

ggcxgctgga aggctgaaxc cacagttcat caacttttgc aatgctgaga acateaagaa 1920 

cggcaxcgag acxctetxca aggtccgctt tccaaatgxx gcgtccgxcg ataacgccgg 1980 

ctgccagcac ttxatctgtc gggaagccat gtgttttaag ctgxtccagg tttctgcctt 2040 

tgxcgxgaac gaaatcaagg ccaatxccct gaaccgggta agagaxcagc xcxtcataag 2100 

catcaacaga axcaaaatac gxctgcaaaa gcacattcaa ggatgaaagc tcacxtgxaa 2160 

tgctttcaaa taattctttt gegccgcgta catcrtcact agaggcggta acgagcgccg 2220 

gcxcaxcgat ttgaacccat xxxacgcctt ctxctxcaag ctctttcaaa agctgxacat 2280

ataatggcac aaggcgxtrtt tggaxegctx trgcttcaga cggxxcatag cctttagcaa 2340 

gcgtaacgaa cgtataaggg ccgacaatca caggcttxgx ttccacaccg xaxxcctgxt 2400 

rgatccggcg ataatcttcg agttgxxxgt ttctxgtcag acggaacxca atgctctcgt 2460 

catattecgg aacgatgtaa tggxaatrttg tatrtaaacca tttxgtcatt tcactagata 2520 

cagcgxctxt gaxtccgcgg gcgaxagcga agxatgtatc ggtagcgtca gtcaaaxgtc 2580 

rgaaccgrtt cgggaxccag trtgaagcxga ctgctgtgtc gagtacatigg tcaxactgtg 2640 

xgaaaxcaga aacaggcaca acaxcaatcx getggxcaax ttgtgttttt actgcggata 2700 

aaaatagtrc gtcgaxxxgc ttcaaaaacg xatcttxatc agtactgcct xxccaatacg 2760 

cttcaagtgc rtxttxccat tcccggttca ggrcgattct cggaaatcct aaattcgatg 2820 

ttttgatggt tgtcattgtx tctcctccag atccgtcgac ctgcataaac tgcatccctt 2880 

aacttgtxxx atttgtatag ttcatccatg ccatgtgtaa tcccagcagc tgttacaaac 2940 

tcaagaagga xcatgtgaxc tctxxxttcg xxgggaxcxx tggaaagggc agattgcgxg 3000 

gacaggxaat ggxxgtctgg raaaaggaca gggccatcgc caattggagt attxtgttga 3060 

xaatggxcxg ctaaxtgaac gcttccaxcx ttaaxgxtgt gxctaatttx gaagxxaact 3120 

xxgatgccat tertrggttt gtctgccatg atgtaxacat tatgtgagtx ataattgtax 3180 

xccattttgx gtccaagaax gtxxccatct xcttxaaaat caaxaccttt xaactcgaxx 3240 

ctaxxaacaa gggtaxcacc ttcaaacttg acxxcagcac gtgtcxxgta gttcccgxca 3300 

xcxttgxaaa atatagxtct rtcctgtaca xaacctxcgg gcatggcact ctrtgaaaaag 3360 

tcaxgctgrt xcatatgaxc tgggxaxcta gaaaagcaxx gaacaccara agagagagta 3420 

gtgacaagcg trggccatgg aacaggtagc txcccagtag xgcaaataaa tttaagggta 3480 

agtttxccgt atgxtgcatc acctxcaccc tctccactaa cagagaaxtx ttgcccatta 3540 

acatcgccat ctaaxteaac aagaattggg acaactccag tgaaaagttc Xtctccxrta 3600 

cxcaxaaagc rtccctccxa gctxttaxxc aataxcattt acatatcata ctaaaaxtaa 3660 

aggctaaagg gaaacgaxgt cxaacgaaaa aaaggccaaa tcaxgtxxgg cctrttggcgg 3720 
xtattxcgat gattgtcccg aattctggcc cxtaaggcca axtcxcatgt xxgacagctt 3780

axcaxcggca axagtxaccc rtattaxcaa gataagaaag aaaaggattt xtcgcxacgc 3840 

tcaaatcctt taaaaaaaca caaaagacca catxxxxtaa xgtggtcttt axxctxcaac 3900 

taaagcaccc arragttcaa caaacgaaaa ttggataaag tgggatattt ttaaaatata 3960 

xaxtxaxgxt acagxaatat xgacttxtaa aaaaggaxxg attcxaatga agaaagcaga 4020 

caagtaagcc tcctaaattc actrcagata aaaatttagg aggcatatca aaxgaacxtt 4080 

aataaaattg axttagacaa xxggaagaga aaagagatat ttaaxcatxa xxxgaaccaa 4140 

caaacgacxx ttagtataac cacagaaatt gatattagtg xxtxataccg aaacataaaa 4200 

caagaaggat axaaatttta ccctgcatrtrt atrttcrtag xgacaagggt gataaacxca 4260 

aatacagctt txagaactgg cracaatagc gacggagagt taggtrtattg ggataagxxa 4320 

gagccacxxx axacaaxxxx tgatggtgta tctaaaacat xcxctggxax rtggacxccx 4380 

gtaaagaatg acxxcaaaga gxtttaxgat ttataccttx cxgaxgtaga gaaatataar 4440 

ggttcgggga aattgttxcc caaaacacct atacctgaaa atgctrtttxc tctttctatt 4500 

attccargga cttcattrtac tgggttxaac xxaaatatca ataataatag taaxxacctt 4560 

ctacccatta ttacagcagg aaaaxtcatt aataaaggta attcaaxaxa xxtaccgcxa 4620 

tctttacagg tacarcattc tgrttgtgat ggttatcatg caggattgtt xatgaactct 4680

axtcaggaat tgtcagatag gcctaatgac xggctttxat aatatgagat aatgccgacx 4740 

gtacxxxxta cagxcggtxx xcraatgxca ctaacctgcc ccgxtagttg aagaaggtxt 4800 

rtatattaca gctccagatc ctctacgccg gacgcatcgx ggccggcatc accggcgcca 4860 

caggtgcggt tgctggcgcc taxatcgccg acatcaccga xggggaagax cgggctcgcc 4920 

acttcgggct caxgagcgct tgtttcggcg xgggtaxggt ggcaggcccc gtggccgggg 4980 

gactgttggg cgccatctcc ttgcatgccc agaaatttat ccttaagctg gattcaggaa 5040 

gaggggcgtt xgacaggaag ggggagnagg catataatga gaxgagracx gttaactggg 5100 

caggatggat ccccagctxg xtgatacacx aatgctxtta xaxagggaaa aggxggtgaa 5160 

ctaccgxgga agtxactgac gtaagaxxac gggxcgaccg ggaaaaccct ggcgxtaccc 5220 

aactxaaxcg cctXgcagca caxccccctt xcgccagcxg gcgtaaxagc gaagaggccc 5280 

gcaccgaxcg cccttcccaa cagtxgcgca gccxgaatgg cgaatggcgc tttgccxggx 5340 

ttccggcacc agaagcggtg ccggaaagct ggctggagxg cgatctxcct gaggccgata 5400 

cxgtcgtcgt cccctcaaac tggcagatgc acggxxacga xgcgcccatc tacaccaacg 5460 

taacctaxcc caxtacggtc aatxcgccgt ttgxxcccac ggagaatccg acgggttgxx 5520 

acxcgctcac axttaaxgxx gatgaaagtt ggctacagga aggccagacg cgaaxtaxxx 5580 

xxgatggcgt xaacxcggcg tttcaxcxgt ggtgcaacgg gcgctgggtc ggttacggcc 5640 

aggacagxcg txxgccgtct gaattrgacc xgagcgcaxt tttacgcgcc ggagaaaacc 5700 

gcctcgcggt gaxggrgctg cgcxggagtg acggcagtxa tctggaagat caggaxatgx 5760 
ggcggatgag cggcattttc cgtgacgxct cgxtgctgca taaaccgact acacaaatca 5820

gcgaxxxcca tgtxgccact cgcxtxaatg atgatxtcag ccgcgctgta ctggaggctg 5880

aagxtcagax gtgcggcgag xxgcgtgacx acctacgggt aacagttxcx xtatggcagg 5940 

gtgaaacgca ggtcgccagc ggcaccgcgc ctttcggcgg tgaaattatc gatgagcgxg 6000 

gtggxtatgc cgatcgcgtc acactacgtc tgaacgtcga aaacccgaaa ctgxggagcg 6060 

ccgaaatccc gaaxctctat cgtgcggxgg tcgaactgca caccgccgac ggcacgctga 6120 

txgaagcaga agcctgcgat gtcggttxcc gcgaggtgcg gattgaaaat ggtctgctgc 6180 

tgcrgaacgg caagccgtxg ctgaxtcgag gcgttaaccg tcacgagcat catcctcxgc 6240 

atggtcaggt carggatgag cagacgargg tgcaggatat cctgctgaxg aagcagaaca 6300 

actxxaacgc cgxgcgcxgx tcgcattatc cgaaccatcc gctgtggxac acgctgtgcg 6360 

accgctacgg cctgtatgtg gxggatgaag ccaatattga aacccacggc atggtgccaa 6420 

tgaaxcgtcx gaccgaxgat ccgcgctggc taccggcgat gagcgaacgc gtaacgcgaa 6480 

tggtgcagcg cgatcgtaat cacccgagtg tgatcatcxg gtcgctgggg aatgaatcag 6540 

gccacggcgc xaaxcacgac gcgctgtatc gctggaXcaa atctgtcgat cctxcccgcc 6600 

cggxgcagta xgaaggcggc ggagccgaca ccacggccac cgatatratt xgcccgatgx 6660 

acgcgcgcgt ggatgaagac cagcccttcc cggctgtgcc gaaatggtcc atcaaaaaat 6720 

ggctttcget acctggagag acgcgcccgc tgatcetttg cgaatacgcc cacgcgatgg 6780 

gxaacagxct xggcggtxxc gccaaatact ggcaggcgtt tcgtcagtat ccccgxxxac 6840 

agggcggcxr cgtctgggac xgggxggatc agtcgctgat taaaxatgat gaaaacggca 6900 

acccgtggtc ggcttacggc ggtgattttg gcgatacgcc gaacgaxcgc cagttctgta 6960 

xgaacggxct ggtctttgcc gaccgcacgc cgcatccagc gctgacggaa gcaaaacacc 7020 

agcagcagxx txtccagxxc cgtxtatccg ggcaaaccax cgaagxgacc agcgaatacc 7080 

xgxtccgtca tagcgaxaac gagctcetgc acxggaxggX ggcgctggat ggtaagccgc 7140 

tggcaagcgg xgaagtgccx ctggatgtcg ctccacaagg taaacagxtg attgaactgc 7200 

ctgaactacc gcagccggag agcgccgggc aactctggct cacagtacgc gtagtgcaac 7260 

cgaacgcgac cgcaxggtca gaagccgggc acatcagcgc crggcagcag xggcgtctgg 7320 

cggaaaaccc cagtgtgacg ctccccgccg cgtcceacgc catcccgcat cxgaccacca 7380 

gcgaaaxgga tttxtgcatc gagcxgggxa ataagcgttg gcaatttaac cgccagxcag 7440 

gctttcrctc acagaxgxgg axtggcgata aaaaacaact gctgacgccg cxgcgcgatc 7500 

agttcacccg tgcaccgctg gataacgaca xxggcgtaag tgaagcgacc cgcattgacc 7560 

ctaacgcctg ggtcgaacgc tggaaggcgg cgggccatta ccaggccgaa gcagcgttgt 7620 

tgcagxgcac ggcagataca cxtgctgaxg cggxgcxgat tacgaccgct cacgcgtggc 7680 

agcaxcaggg gaaaaccxxa ttxatcagcc ggaaaaccxa ccggaxxgat ggxagtggxc 7740 

aaatggcgax xaccgxxgax gxxgaagtgg cgagcgatac accgcaxccg gcgcggatxg 7800 
gcctgaactg ccagctggcg caggtagcag agcgggtaaa ctggctcgga xtagggccgc 7860

aagaaaacta tcccgacegc cttactgceg cctgrcttga ccgctgggat ctgccattgt 7920 

cagacatgta taccccgtac gtcttcccga gcgaaaacgg tctgcgctgc gggacgcgcg 7980 

aattgaatta tggcccacac cagxggcgcg gcgacttcca gtteaacatc agccgctaca 8040 

gtcaacagca actgatggaa accagccatc gccatctgct gcacgcggaa gaaggcacat 8100 

ggctgaatat cgacggtttc catatgggga ttggtggcga cgactcctgg agcccgtcag 8160

tatcggcgga attacagctg agcgccggtc gcxaccatta ccagttggtc tggtgtcaaa 8220 

aataagcatg caccaxxcct tgcggcggcg gtgctcaacg gcctcaacct acxaccgggc 8280 

tgcttccxaa tgcaggagtc gcaraaggga gagcgtcgac atggatgagc gatgatgata 8340 

xccgtttagg ctgggcggtg atagczzcvc gttcaggcag tacgcctctt ttcttttcca 8400 

gacctgaggg aggcggaaat ggxgtgaggt tcccggggaa aagccaaata ggcgatcgcg 8460 

ggagtgcttt atttgaagat caggctarca ctgcggtcaa tagatttcac aatgtgatgg 8520 

ctggacagcc tgaggaactc tcgaacccga atggaaacaa ccagatattt atgaatcage 8580 

gcggctcaca tggcgttgtg ctggcaaatg caggttcatc ctctgtctct atcaatacgg 8640 

caacaaaatt gcctgatggc aggtatgaca ataaagrtgg agcgggttca tttcaagtga 8700 

acgatggtaa actgacaggc acgatxaatg ccaggtctgt agctgtgctt tatcctgatg 8760 

arattgcaaa agcgcctcat gttttccttg agaattacaa aacaggtgta acacattctr 8820 

tcaatgatca actgacgatt accttgcgtg cagatgcgaa tacaacaaaa gccgtttatc 8880 

aaatcaataa tggaccagac gacaggcgtt taaggatgga gatcaattca caatcggaaa 8940 

aggagatcca atttggcaaa acatacacca tcatgttaaa aggaacgaac agtgatggtg 9000 

taacgaggac cgagaaatac agttttgtta aaagagatcc agcgtcggcc aaaaccatcg 9060 

gctatcaaaa rccgaatcat tggagccagg taaatgctta tarctataaa catgatggga 9120 

gccgagtaat tgaattgacc ggatcttggc ctggaaaacc aatgactaaa aatgcagaeg 9180 

gaatttacac gctgacgctg cctgcggaca cggatacaac eaacgcaaaa gtgattttta 9240 

araatggcag cgcccaagtg cccggtcaga atcagcctgg ctttgattac gtgctaaatg 9300 

gtttatataa tgactcgggc ttaagcggtt ctcttcccca ttgagggcaa ggctagacgg 9360 

gacttaccga aagaaaccat caatgatggt ttcttrtttg ttcataaatc agacaaaact 9420 

tttctcttgc aaaagtttgt gaagtgttgc acaatataaa tgtgaaatac ttcacaaaca 9480 

aaaagacatc aaagagaaac ataccctgca aggatgctga xartgtctgc atrtgcgccg 9540 

gagcaaacca aaaacctggt gagacacgcc ttgaartagt agaaaagaac ttgaagattt 9600 

tcaaaggcat cgttagtgaa gtcatggcga gcggatttga cggcattttc ttagtcggta 9660 

acaatcctcg ttaaaggaca aggacctgag cggaagtgta tcgtacagta gacggagtat 9720 

actagtatag tctatagtcc gtggaattat xatatttatc tccgacgata ttctcatcag 9780 

tgaaatccag ctggagttct rtagcaaatt tttttattag cxgaacttag tartagtggg 9840 
gccgcxgata axxactaaxa cxaggagaag ttaataaata cgnaaccaac atgaxtaaca 9900
attattagag gtcatcgxtc aaaatggtat gcgtxxtgac acaxccacta taxatccgtg 9960
tcgttcxgxc cactcctgaa xeccatxcca gaaattcxct agcgattcca gaagxttctc 10020
agagtcggaa agttgaccag acattacgaa ctggcacaga tggtcaxaac ctgaaggaag 10080
atxtgaxxgc xtaactgctt cagrtaagac cgaagcgctc gtcgtaxaac agatgcgatg 10140
atgcagacca atcaacaxgg cacctgccat tgctacetgt acagtcaagg atggtagaaa 10200
tgxxgtcggt ccttgcacac gaataxtacg ccatttgcct gcatattcaa acagctcttc 10260
tacgataagg gcacaaatcg catcgtggaa cgrctgggct xcxaecgaxt tagcagtrtg 10320
axacacxxxc tctaagtatc cacctgaatc ataaaxcggc aaaaxagaga aaaattgacc 10380
atgtgtaagc ggccaatctg attccacctg agatgcataa tctagtagaa tctcttcgct 10440
atcaaaattc acttccaccx tccactcacc ggttgtccat tcatggctga actctgcttc 10500
ctctgtrtgac axgacacaca xcatctcaax atccgaatag ggcccaxcag tctgacgacc 10560
aagagagcca taaacaccaa tagccttaac axcatcccca tatttatcca arattcgttc 10620
cxxaaxttca xgaacaatct xcatxctxxc ttctctagtc attattattg gtccaxxcac 10680
tattcxcaxt ccctttxcag araaxxttag atxxgctttt cxaaataaga axatttggag 10740
agcaccgttc xtattcagcx attaataacx cgtcttccta agcaxcatgg rcxcactttt 10800
ccactxttxg xcttgtccac xaaaacccxx gattxxtcat ctgaaxaaat gctactatta 10860
ggacacataa tattaaaaga aacccccaxc tatttagxta tttgttxagt cactxataac 10920
xttaacagat ggggtrttttc xgxgcaacca attttaaggg xtttcaaxac tttaaaacac 10980
atacaxacca acacxxcaac gcaccxttca gcaactaaaa xaaaaatgac gxxattxcta 11040
taxgXatcaa gaxaagaaag aacaagttca aaaccatcaa aaaaagacac ctxtxcaggt 11100
gcttxxxxxa ttxxaxaaac tcatxccctg atcxccccat acxcctccaa xccaaagcta 11160
xttagaaaga ttactataxc ctcaaacagg cggtaaccgg cctcxxcatc gggaaxgcgc 11220
gcgaccttca gcatcgccgg catgtccccc tggcggacgg gaagxaxcca gcxcgaggtc 11280
gggccgcgxx gcxggcgttt xxccataggc xccgcccccc xgacgagcat cacaaaaaxc 11340
gacgcxcaag tcagaggtgg cgaaacccga caggactaxa aagataccag gegtttcccc 11400
ctggaagctc ectcgtgcgc xctccrgxtc cgaccctgcc gcttaccgga tacctgtccg 11460
ccttxctccc ttcgggaagc gtggcgcttt cxcatagctc acgctgtagg tatctcagxx 11520
cggtgtaggx cgttcgctcc aagctgggct gtgxgcacga accccccgtx cagcccgacc 11580
gcrgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac gacxxatcgc 11640
cacxggcagc agccactggx aacaggarca gcagagcgag gtatgxaggc ggtgctacag 11700
agxtctxgaa gxggxggcct aactacggct acactagaag gacagxaxxx ggtaxcxgcg 11760
cxctgctgaa gccagttacc xxcggaaaaa gagttggtag cxctxgatcc ggcaaacaaa 11820
ccaccgctgg xagcggxggt ttxxxxgttrt gcaagcagca gaXXacgcgc agaaaaaaag 11880
gatctcaaga agatcctttg atctttrtcta cggggtctga cgctcagtgg aacgaaaact 11940
cacgttaagg gattttggtc atgagattat caaaaaggat cttcacctag atccttttaa 12000
attaaaaatg aagttttaaa tcaatctaaa gtatatatga gtaaacttgg tctgacagtr 12060
accaatgctt aatcagtgag gcacctatct cagcgatctg tctatttcgt tcatccatag 12120
ttgcctgact ccccgtcgtg tagataacta cgatacggga gggcttacca tctggcccca 12180
grgctgcaat gataccgcga gacccacgct caccggctcc agarttatca gcaataaacc 12240
agccagccgg aagggccgag cgcagaagtg gtcctgcaac tttatccgcc tccatccagt 12300
ctattaattg ttgccgggaa gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg 12360
ttgttgccat tgctgcaggc atcgtggtgt cacgctcgtc gtttggtatg gcttcattca 12420
gctccggttc ccaacgatca aggcgagtta catgatcccc catgttgtgc aaaaaagcgg 12480
ttagctcctt cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg ttatcactca 12540
tggttatgge agcactgcat aattctctta ctgtcatgcc atccgxaaga tgcctttctg 12600
tgactggtga gtactcaacc aagtcattet gagaatagtg tatgcggcga ccgagttgct 12660
cttgcccggc gtcaacacgg gataataccg cgccacatag cagaacttta aaagtgccca 12720
tcattggaaa acgttcttcg gggcgaaaac tctcaaggat cttaccgctg ttgagatcca 12780
gttcgatgta acccactcgt gcacccaact gatcttcagc atcttttact ttcaccagcg 12840
tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata agggcgacac 12900
ggaaatgttg aatactcaxa ctcttccttt ttcaatatta ttgaagcatt tatcagggtt 12960
attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggttc 13020
cgcgcacatt tccccgaaaa gtgccacctg acgtctaaga aaccattaxt atcatgacat 13080
taacctataa aaataggcgt atcacgaggc cctttcgtct tcaagaatt 13129



<210> 9
<211> 87
<212> DNA
<213> artificial sequence

<220>
<223> Promoter PR30

gaattcatgc atcgcggagg tgagatttga cactagtagg ctacgggact ataatgcggg 60
aagtactgxt aactgcagga taagctx 87



<210> 10
<211> 86
<212> DNA
<213> artificial sequence

<220>
<223> Promoter PR43

gaartcatgc attcgaattt ggaaatcgac aggagcgggc gggtagggta taataratgt 60
agtactgtta actgcaggat aagctt 86

<210> 11 

<211> 87 

<212> DNA

<213> artificial sequence 

<220> 

<223> Promoter PR119 

gaattcaSc atcgagcgga agtrcgttga cacagctcca ggatacaaat ataatgggtc 60 
gagtactgtt aactgcagga taagcrt 87

<210> 12 
<211> 86 
<212> DNA 

<213> artificial sequence 
<220>

<223> Promoter PR164 

gaattcatgc atggacagtt cgtctttgac aaatctaaga aagggaacta taatgtgggc 60 
agtactgtta actgcaggat aagctt 86 

<210> 13 
<211> 87 
<212> DNA 

<213> artificial sequence 
<220> 

<223> Promoter PR342 

gaattcatgc atgcggatgg aaggggttga caccggcgcc gggtccaggt ataatcttga 60 
cagtactgtt aactgcagga taagctt 87

<210> 14 
<211> 87 
<212> DNA 

<213> artificial sequence 
<220> 

<223> promoter PR409 

gaattcatgc atagaggagt ttattcttga caaatgcgag gcagaatggt ataatacgta 60 
gagtactgtt aactgcagga taagctt 87 

<210> 15 

<211> 87 

<212> DNA

<213> artificial sequence



<220> 

<223> Promoter PR519 
Saatteatgc atagtaaagt ttattcttga caagaattgg cgcgggtgat ataataaata 60
cagtactgtrt aactgcagga taagctt 87



<210> 16 

<211> 2591 

<212> DNA

<213> Bacillus subtilis



<220> 

<221> CDS 

<222> (997)..(2196)

<223> MetE coding sequence 



ggatceg"t tgaaatgcgg agcagaaaga atggtgaacg gcxgctcaac tgttttctta 60
tcaxttcctt ccggacggax aaacagctgc cgcgcaaata aattgtgcca tgcgaactca 120
tttacgacag tgatcggcag cctgtatttc tcgtctgctc cggcaaatcc ttcgaaaaca 180
aacagttcat ctcgctcctt taaatagctg acaaetttcg tgtacagccg ctcaaacgct 240
tcrtctgaaa tcggctgatt caccgggccc caatcgatct tatttttcgt gctttctccc 300
tccacgatga attxatcttt aggtgagcgx eetgtgtaag cgcctgrttgt cgcgcgaaca 360
gcacctgtgg atgttaaaat gccttcgtrtt cgggagagga ctttttctgt tagctgtgct 420
gctgataaat tatgacgcac atrcggacat gttaataagg ctxgtgaatc agcggtcaaa 480
tcaaetgagt tcatatgaaa ccttccttta tcgttttttg tgttttgcta attgtgaatt 540
agtataacat atattttcaa atagtctata ctatttattg ttttttgtgt gtgcatttcc 600
attgttttcc ctcaatatag gtgcctattt cttctgaatc atattgacat tgcaaaccct 660
txtacgataa gatatttcax tgagcggata ctcttateec gagctggcgg agggacaggc 720
cctatgaagc ccagcaaccg gtttctctgt tatttattat gttcaattga gtgagacaac 780
caaggtgcra acctgtxgca aggttgtatg attcettgag cgataagagt gaaaggcaca 840
aagaccaaac cctttcctcg atggaaaagg tttttttatt tcataaatat gccaattaac 900
attctctaat ataactgtac attgtataag agggagcgag ttccgtatca tatatacaag 960
gtctttcggg aggccttgtg caggaggaag caaatc atg agt aaa aat cgt cgt 1014
                                        Met Ser Lys Asn Arg Arg
                                        1               5

tta ttt aca tea gaa tct gtt acg gag ggg tat ccg gat aaa ate tgt 1062
Leu Phe Thr Ser Glu Ser Val Thr Glu Gly His Pro Asp Lys Ile Cys
            10                  15                  20

nai- ran airt rat aac aac att tta gat gaa att tta aag aaa gac cct 1110
Asp Gln Ile Tyr Asp Ser Ile Leu Asp Glu Ile Leu Lys Lys Asp Pro
        25                  30                  35

aac oca cat gtt act tgt gaa aca tct grg aca aca ggt ttg gtt ctt 1158
Asn Ala Arg Val Ala Cys Glu Thr Ser Val Thr Thr Gly Leu Val Leu
    40                  45                  50

ata aac aaa aaq ate aca act tct acg tat gtt gac art ccg aaa acg 1206
Val Ser Gly Glu Ile Thr Thr Ser Thr Tyr Val Asp Ile Pro Lys Thr
55                  60                  65                  70

ott cac caa acc art aaa gaa ate gga tac aca cgt gca aaa tac gga 1254
Val Arg Gln Thr Ile Lys Glu Ile Gly Tyr Thr Arg Ala Lys Tyr Gly
                75                  80                  85

ret nat qcq gaa act tgt gcg gtt tea aca tea att gat gag cag tct 1302
Phe Asp Ala Glu Thr Cys Ala Val Leu Thr Ser Ile Asp Glu Gln Ser
            90                  95                  100

act aat ate gcg atg ggc gta gac cca gcg ctt gaa gec cgt gaa gac 1350
Ala Asp Ile Ala Met Gly Val Asp Pro Ala Leu Glu Ala Arg Glu Gly
        105                 110                 115

aca arq age y< iu gaa gaa at* gaa gcg art got gcg gat gac caa gaa 1398
Thr Met Ser Asp Glu Glu Ile Glu Ala Ile Gly Ala Gly Asp Gln Gly
    120                 125                 130

tta ata ttc aat tat gtg tgc aac gaa acg aaa gag ctt atg cct ctt 1446
Leu Met Phe Gly Tyr Val Cys Asn Glu Thr Lys Glu Leu Met Pro Leu
135                 140                 145                 150

cca att tea ctt acc cat aaa tta gee cgc cgc eta agt gaa gtc cgt 1494
Pro Ile Ser Leu Ala His Lys Leu Ala Arg Arg Leu Ser Glu Val Arg
                155                 160                 165

aaa gaa gat att ctt ccg tac ctt cgc cct gac ggc aaa aca cag gta 1542
Lys Glu Asp Ile Leu Pro Tyr Leu Arg Pro Asp Gly Lys Thr Gln Val
            170                 175                 180

acq att gag tac gat gaa aat aac aaa cca gtc cgc att gac gcg att 1590
Thr Val Glu Tyr Asp Glu Asn Asn Lys Pro Val Arg Ile Asp Ala Ile
        185                 190                 195

gtt att tea act cag cat cac cct gaa att aca ctt gag caa att cag 1638
Val Ile Ser Thr Gln His His Pro Glu Ile Thr Leu Glu Gln Ile Gln
    200                 205                 210

enc aac att aaa aaa cat gta ate aat ccg gtt gtt cct gaa gag ctg 1686
Arg Asn Ile Lys Glu His Val Ile Asn Pro Val Val Pro Glu Glu Leu
215                 220                 225                 230

att gat gaa gaa aca aaa tat ttc ate aac cct aca gga cgt ttc gta 1734
Ile Asp Glu Glu Thr Lys Tyr Phe Ile Asn Pro Thr Gly Arg Phe Val
                235                 240                 245

ate gga ggc cct caa ggg gat gcg gga ctt aca gga cgc aaa ate ate 1782
Ile Gly Gly Pro Gln Gly Asp Ala Gly Leu Thr Gly Arg Lys Ile Ile
            250                 255                 260

gtt gat acg tac ggc ggc tat gca cgc cac ggc gga ggc gcg ttc tea 1830
Val Asp Thr Tyr Gly Gly Tyr Ala Arg His Gly Gly Gly Ala Phe Ser
        265                 270                 275

ggt aag gac gcg acg aag gta gac cgt tct gca get tat gcg gca aga 1878
Gly Lys Asp Ala Thr Lys Val Asp Arg Ser Ala Ala Tyr Ala Ala Arg
    280                 285                 290

tac qtt gcg aaa aac ate gtt gcg get gag ctt get gat tct tgc gaa 1926
Tyr Val Ala Lys Asn Ile Val Ala Ala Glu Leu Ala Asp Ser Cys Glu
295                 300                 305                 310

gta cag ctt get tac gcg ate ggt gtt gca cag cct gtg tea ate tea 1974
Val Gln Leu Ala Tyr Ala Ile Gly Val Ala Gln Pro Val Ser Ile Ser
                315                 320                 325

ate aac aca ttc ggt tea gga aaa get tct gag gaa aaa ctg act gaa 2022
Ile Asn Thr Phe Gly Ser Gly Lys Ala Ser Glu Glu Lys Leu Ile Glu
            330                 335                 340

ott gtt cgc aat aac ttt gax tta cga cct gcc ggc att ate aaa atg 2070
Val Val Arg Asn Asn Phe Asp Leu Arg Pro Ala Gly Ile Ile Lys Met
        345                 350                 355

ctt qat ttg cgc cgt ccg ate tat aaa eaa act get gcg tac ggc cac 2118
Leu Asp Leu Arg Arg Pro Ile Tyr Lys Gln Thr Ala Ala Tyr Gly His
    360                 365                 370

ttt aaa cat cae gat gtt gac ctt cca tgg gag cgc aca gac aaa gcg 2166
Phe Gly Arg His Asp Val Asp Leu Pro Trp Glu Arg Thr Asp Lys Ala
375                 380                 385                 390

gaa cag erg cgt aaa gaa guy n_a gga gaa taa xtttatagee gcttactiggt 2219
Glu Gln Leu Arg Lys Glu Ala Leu Gly Glu
                395                 400

taageggett tccctttttt atcgttgtat trcatgtttat ttttttacat aactgegaaa 2279
ccaaatacta ttcacagcgt ctataaatag gggttcaatg atgacaattt taattatgga 2339
ggcaatacta tgtgtggatt tgtcggggtt tttaacaagc ateegttage tcaaaccgct 2399
gatcaagaag aactaatcaa acaaatgaac caaatgatcg ttcaccgcgg tcctgacagt 2459
gatggatiatt tccatgatga gcacgtcggc xteggattea gaeggctcag cattattgat 2519
gtagaaaatg gtggacagcc rttatcatat gaagatgaaa catattggat tatctttaac 2579
ggagtaaacc ta 2591

<210> 17
<211> 400
<212> PRT
<213> Bacillus subtilis

<400> 17

Met Ser Lys Asn Arg Arg Leu Phe Thr Ser Glu Ser Val Thr Glu Gly
1               5                   10                  15

His Pro Asp Lys Ile Cys Asp Gln Ile Tyr Asp Ser Ile Leu Asp Glu
            20                  25                  30

Ile Leu Lys Lys Asp Pro Asn Ala Arg Val Ala Cys Glu Thr Ser Val
        35                  40                  45

Thr Thr Gly Leu Val Leu Val Ser Gly Glu Ile Thr Thr Ser Thr Tyr
    50                  55                  60

Val Asp Ile Pro Lys Thr Val Arg Gln Thr Ile Lys Glu Ile Gly Tyr
65                  70                  75                  80

Thr Arg Ala Lys Tyr Gly Phe Asp Ala Glu Thr Cys Ala Val Leu Thr
                85                  90                  95

Ser Ile Asp Glu Gln Ser Ala Asp Ile Ala Met Gly Val Asp Pro Ala
            100                 105                 110

Leu Glu Ala Arg Glu Gly Thr Met Ser Asp Glu Glu Ile Glu Ala Ile
        115                 120                 125

Gly Ala Gly Asp Gln Gly Leu Met Phe Gly Tyr Val Cys Asn Glu Thr
    130                 135                 140

Lys Glu Leu Met Pro Leu Pro Ile Ser Leu Ala His Lys Leu Ala Arg
145                 150                 155                 160

Arg Leu Ser Glu Val Arg Lys Glu Asp Ile Leu Pro Tyr Leu Arg Pro
                165                 170                 175

Asp Gly Lys Thr Gln Val Thr Val Glu Tyr Asp Glu Asn Asn Lys Pro
            180                 185                 190

Val Arg Ile Asp Ala Ile Val Ile Ser Thr Gln His His Pro Glu Ile
        195                 200                 205

Thr Leu Glu Gln Ile Gln Arg Asn Ile Lys Glu His Val Ile Asn Pro
    210                 215                 220

Val Val Pro Glu Glu Leu Ile Asp Glu Glu Thr Lys Tyr Phe Ile Asn
225                 230                 235                 240

Pro Thr Gly Arg Phe Val Ile Gly Gly Pro Gln Gly Asp Ala Gly Leu
                245                 250                 255

Thr Gly Arg Lys Ile Ile Val Asp Thr Tyr Gly Gly Tyr Ala Arg His
            260                 265                 270

Gly Gly Gly Ala Phe Ser Gly Lys Asp Ala Thr Lys Val Asp Arg Ser
        275                 280                 285

Ala Ala Tyr Ala Ala Arg Tyr Val Ala Lys Asn Ile Val Ala Ala Glu
    290                 295                 300

Leu Ala Asp Ser Cys Glu Val Gln Leu Ala Tyr Ala Ile Gly Val Ala
305                 310                 315                 320

Gln Pro Val Ser Ile Ser Ile Asn Thr Phe Gly Ser Gly Lys Ala Ser
                325                 330                 335

Glu Glu Lys Leu Ile Glu Val Val Arg Asn Asn Phe Asp Leu Arg Pro
            340                 345                 350

Ala Gly Ile Ile Lys Met Leu Asp Leu Arg Arg Pro Ile Tyr Lys Gln
        355                 360                 365

Thr Ala Ala Tyr Gly His Phe Gly Arg His Asp Val Asp Leu Pro Trp
    370                 375                 380

Glu Arg Thr Asp Lys Ala Glu Gln Leu Arg Lys Glu Ala Leu Gly Glu
385                 390                 395                 400